Commit Graph

20753 Commits

Author SHA1 Message Date
Sanjay Patel 306f14ceb8 [InstCombine] clean up foldVectorBinop(); NFC
1. Fix include ordering.
2. Improve variable name (width is bitwidth not number-of-elements).
3. Add local Opcode variable to reduce code duplication.

llvm-svn: 343694
2018-10-03 15:46:03 +00:00
Sanjay Patel 79dceb2903 [InstCombine] name change: foldShuffledBinop -> foldVectorBinop; NFC
This function will deal with more than shuffles with D50992, and I 
have another potential per-element fold that could live here.

llvm-svn: 343692
2018-10-03 15:20:58 +00:00
Florian Hahn 11a1423348 [LoopInterchange] Remove unused variable PreserveLCSSA (NFC).
llvm-svn: 343676
2018-10-03 11:01:23 +00:00
Aditya Kumar a27014b851 Improve static analysis of cold basic blocks
Differential Revision: https://reviews.llvm.org/D52704

Reviewers: sebpop, tejohnson, brzycki, SirishP
Reviewed By: sebpop

llvm-svn: 343663
2018-10-03 06:21:05 +00:00
Aditya Kumar 9e20ade72a Add support for new pass manager
Modified the testcases to use both pass managers
Use single commandline flag for both pass managers.

Differential Revision: https://reviews.llvm.org/D52708
Reviewers: sebpop, tejohnson, brzycki, SirishP
Reviewed By: tejohnson, brzycki

llvm-svn: 343662
2018-10-03 05:55:20 +00:00
David Green 1e44c3b62c [InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A
This is an attempt to get out of a local-minimum that instcombine currently
gets stuck in. We essentially combine two optimisations at once, ~a - ~b = b-a
and min(~a, ~b) = ~max(a, b), only doing the transform if the result is at
least neutral. This involves using IsFreeToInvert, which has been expanded a
little to include selects that can be easily inverted.

This is trying to fix PR35875, using the ideas from Sanjay. It is a large
improvement to one of our rgb to cmy kernels.

Differential Revision: https://reviews.llvm.org/D52177

llvm-svn: 343569
2018-10-02 09:48:34 +00:00
Craig Topper d616d33a96 [SimplifyCFG] Use Value::hasNUses instead of 'getNumUses() =='. NFCI
getNumUses is linear in the number of uses. Since we're looking for a specific use count, we can use hasNUses which will stop as soon as it determines there are more than N uses instead of walking all of them.

llvm-svn: 343550
2018-10-01 23:09:52 +00:00
Craig Topper 90c0a0621c [SimplifyCFG] Update comments that refer to CondBB to say ThenBB instead. NFC
There is no variable in this function named CondBB, but there is one named ThenBB and I believe the comments are all refering to it.

llvm-svn: 343548
2018-10-01 22:56:11 +00:00
Eric Christopher dcf1d97c5c Temporarily revert "[GVNHoist] Re-enable GVNHoist by default"
This reverts commit r342387 as it's showing significant performance
regressions in a number of benchmarks. Followed up with the
committer and original thread with an example and will get performance
numbers before recommitting.

llvm-svn: 343522
2018-10-01 18:57:08 +00:00
Jesper Antonsson c954b86391 [InstCombine] Handle vector compares in foldGEPIcmp(), take 2
Summary:
This is a continuation of the fix for PR34627 "InstCombine assertion at vector gep/icmp folding". (I just realized bugpoint had fuzzed the original test for me, so I had fixed another trigger of the same assert in adjacent code in InstCombine.)

This patch avoids optimizing an icmp (to look only at the base pointers) when the resulting icmp would have a different type.

The patch adds a testcase and also cleans up and shrinks the pre-existing test for the adjacent assert trigger.

Reviewers: lebedev.ri, majnemer, spatel

Reviewed By: lebedev.ri

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52494

llvm-svn: 343486
2018-10-01 14:59:25 +00:00
Sanjay Patel 31b07198f1 [InstCombine] try to convert vector insert+extract to trunc; 2nd try
This was originally committed at rL343407, but reverted at 
rL343458 because it crashed trying to handle a case where
the destination type is FP. This version of the patch adds
a check for that possibility. Tests added at rL343480.

Original commit message:

This transform is requested for the backend in:
https://bugs.llvm.org/show_bug.cgi?id=39016
...but I figured it was worth doing in IR too, and it's probably
easier to implement here, so that's this patch.

In the simplest case, we are just truncating a scalar value. If the
extract index doesn't correspond to the LSBs of the scalar, then we
have to shift-right before the truncate. Endian-ness makes this tricky,
but hopefully the ASCII-art helps visualize the transform.

Differential Revision: https://reviews.llvm.org/D52439

llvm-svn: 343482
2018-10-01 14:40:00 +00:00
Hans Wennborg a60aa91374 Revert r343407 "[InstCombine] try to convert vector insert+extract to trunc"
This caused Chromium builds to fail with "Illegal Trunc" assertion.
See https://crbug.com/890723 for repro.

> This transform is requested for the backend in:
> https://bugs.llvm.org/show_bug.cgi?id=39016
> ...but I figured it was worth doing in IR too, and it's probably
> easier to implement here, so that's this patch.
>
> In the simplest case, we are just truncating a scalar value. If the
> extract index doesn't correspond to the LSBs of the scalar, then we
> have to shift-right before the truncate. Endian-ness makes this tricky,
> but hopefully the ASCII-art helps visualize the transform.
>
> Differential Revision: https://reviews.llvm.org/D52439

llvm-svn: 343458
2018-10-01 12:07:45 +00:00
Florian Hahn 8600fee52e Recommit r343308: [LoopInterchange] Turn into a loop pass.
llvm-svn: 343450
2018-10-01 09:59:48 +00:00
Fangrui Song 3507c6e884 Use the container form llvm::sort(C, ...)
There are a few leftovers in rL343163 which span two lines. This commit
changes these llvm::sort(C.begin(), C.end, ...) to llvm::sort(C, ...)

llvm-svn: 343426
2018-09-30 22:31:29 +00:00
Sanjay Patel 1e0f1f645a [InstCombine] try to convert vector insert+extract to trunc
This transform is requested for the backend in:
https://bugs.llvm.org/show_bug.cgi?id=39016
...but I figured it was worth doing in IR too, and it's probably 
easier to implement here, so that's this patch.

In the simplest case, we are just truncating a scalar value. If the 
extract index doesn't correspond to the LSBs of the scalar, then we 
have to shift-right before the truncate. Endian-ness makes this tricky, 
but hopefully the ASCII-art helps visualize the transform.

Differential Revision: https://reviews.llvm.org/D52439

llvm-svn: 343407
2018-09-30 14:34:01 +00:00
Sanjay Patel 26c119a9c2 [InstCombine] allow lengthening of insertelement to eliminate shuffles
As noted in post-commit comments for D52548, the limitation on 
increasing vector length can be applied by opcode.
As a first step, this patch only allows insertelement to be
widened because that has no logical downsides for IR and has 
little risk of pessimizing codegen.

This may cause PR39132 to go into hiding during a full compile,
but that bug is not fixed.

llvm-svn: 343406
2018-09-30 13:50:42 +00:00
Sanjay Patel 54d31ef87e [InstCombine] fix formatting in vector evaluators; NFC
We need to alter the functionality as shown in D52548.

llvm-svn: 343379
2018-09-29 15:05:24 +00:00
Vitaly Buka 0509070811 [cxx2a] Fix warning triggered by r343285
llvm-svn: 343369
2018-09-29 02:17:12 +00:00
whitequark 29b2980159 Revert "[LLVM-C] Add bindings for addCoroutinePassesToExtensionPoints"
This reverts commit c4baf7c2f06ff5459c4f5998ce980346e72bff97.

Broke the bots, and should really be in Transforms/Coroutines
instead.

llvm-svn: 343337
2018-09-28 16:45:18 +00:00
whitequark 937afbc365 [LLVM-C] Add bindings for addCoroutinePassesToExtensionPoints
Summary: This patch adds bindings to C and Go for addCoroutinePassesToExtensionPoints, which is used to add coroutine passes to the correct locations in PassManagerBuilder.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: mehdi_amini, modocache, llvm-commits

Differential Revision: https://reviews.llvm.org/D51642

llvm-svn: 343336
2018-09-28 16:38:11 +00:00
Sanjay Patel 242f90fe82 [InstCombine] don't propagate wider shufflevector arguments to predecessors
InstCombine would propagate shufflevector insts that had wider output vectors onto 
predecessors, which would sometimes push undef's onto the divisor of a div/rem and 
result in bad codegen.

I've fixed this by just banning propagating shufflevector back if the result of 
the shufflevector is wider than the input vectors.

Patch by: @sheredom (Neil Henning)

Differential Revision: https://reviews.llvm.org/D52548

llvm-svn: 343329
2018-09-28 15:24:41 +00:00
Florian Hahn 8d72ecc36f Revert r343308: [LoopInterchange] Turn into a loop pass.
llvm-svn: 343310
2018-09-28 10:20:07 +00:00
Florian Hahn 0694c159f7 [LoopInterchange] Turn into a loop pass.
This patch turns LoopInterchange into a loop pass. It now only
considers top-level loops and tries to move the innermost loop to the
optimal position within the loop nest. By only looking at top-level
loops, we might miss a few opportunities the function pass would get
(e.g. if we have a loop nest of 3 loops, in the function pass
we might process loops at level 1 and 2 and move the inner most loop to
level 1, and then we process loops at levels 0, 1, 2 and interchange
again, because we now have a different inner loop). But I think it would
be better to handle such cases by picking the best inner loop from the
start and avoid re-visiting the same loops again.

The biggest advantage of it being a function pass is that it interacts
nicely with the other loop passes. Without this patch, there are some
performance regressions on AArch64 with loop interchanging enabled,
where no loops were interchanged, but we missed out on some other loop
optimizations.

It also removes the SimplifyCFG run. We are just changing branches, so
the CFG should not be more complicated, besides the additional 'unique'
preheaders this pass might create.


Reviewers: chandlerc, efriedma, mcrosier, javed.absar, xbolva00

Reviewed By: xbolva00

Differential Revision: https://reviews.llvm.org/D51702

llvm-svn: 343308
2018-09-28 09:45:50 +00:00
Sanjay Patel c3f50ff92e [InstCombine] Without infinites, fold (C / X) < 0.0 --> (X < 0)
When C is not zero and infinites are not allowed (C / X) > 0 is a sign
test. Depending on the sign of C, the predicate must be swapped.

E.g.:
  foo(double X) {
    if ((-2.0 / X) <= 0) ...
  }
 =>
  foo(double X) {
    if (X >= 0) ...
  }

Patch by: @marels (Martin Elshuber)

Differential Revision: https://reviews.llvm.org/D51942

llvm-svn: 343228
2018-09-27 15:59:24 +00:00
Teresa Johnson f24136f17a [WPD] Fix incorrect devirtualization after indirect call promotion
Summary:
Add a dominance check to ensure that the possible devirtualizable
call is actually dominated by the type test/checked load intrinsic being
analyzed. With PGO, after indirect call promotion is performed during
the compile step, followed by inlining, we may have a type test in the
promoted and inlined sequence that allows an indirect call in that
sequence to be devirtualized. That indirect call (inserted by inlining
after promotion) will share the same vtable pointer as the fallback
indirect call that cannot be devirtualized.

Before this patch the code was incorrectly devirtualizing the fallback
indirect call.

See the new test and the example described there for more details.

Reviewers: pcc, vitalybuka

Subscribers: mehdi_amini, Prazek, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D52514

llvm-svn: 343226
2018-09-27 14:55:32 +00:00
Fangrui Song 0cac726a00 llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...)
Summary: The convenience wrapper in STLExtras is available since rL342102.

Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb

Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D52573

llvm-svn: 343163
2018-09-27 02:13:45 +00:00
Florian Hahn 6feb637124 [LoopInterchange] Preserve LCSSA.
This patch extends LoopInterchange to move LCSSA to the right place
after interchanging. This is required for LoopInterchange to become a
function pass.

An alternative to the manual moving of the PHIs, we could also re-form
the LCSSA phis for a set of interchanged loops, but that's more
expensive.

Reviewers: efriedma, mcrosier, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D52154

llvm-svn: 343132
2018-09-26 19:34:25 +00:00
Vyacheslav Zakharin e06831a3b2 Remove LoopID metadata from the branch instruction
that follows the peeled iterations.

Differential Revision: https://reviews.llvm.org/D52176

llvm-svn: 343054
2018-09-26 01:03:21 +00:00
Zhaoshi Zheng 95710337b4 Revert "Revert "[ConstHoist] Do not rebase single (or few) dependent constant""
This reverts commit bd7b44f35ee9fbe365eb25ce55437ea793b39346.

Reland r342994: disabled the optimization and explicitly enable it in test.

-mllvm -consthoist-min-num-to-rebase<unsigned>=0

[ConstHoist] Do not rebase single (or few) dependent constant

If an instance (InsertionPoint or IP) of Base constant A has only one or few
rebased constants depending on it, do NOT rebase. One extra ADD instruction is
required to materialize each rebased constant, assuming A and the rebased have
the same materialization cost.

Differential Revision: https://reviews.llvm.org/D52243

llvm-svn: 343053
2018-09-26 00:59:09 +00:00
Anna Thomas b1e3d45318 [LV][LAA] Vectorize loop invariant values stored into loop invariant address
Summary:
We are overly conservative in loop vectorizer with respect to stores to loop
invariant addresses.
More details in https://bugs.llvm.org/show_bug.cgi?id=38546
This is the first part of the fix where we start with vectorizing loop invariant
values to loop invariant addresses.

This also includes changes to ORE for stores to invariant address.

Reviewers: anemet, Ayal, mkuper, mssimpso

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50665

llvm-svn: 343028
2018-09-25 20:57:20 +00:00
Jessica Paquette e02de05b32 Revert "[ConstHoist] Do not rebase single (or few) dependent constant"
This caused a couple test failures on a bot:

CodeGen/X86/constant-hoisting-bfi.ll
Transforms/ConstantHoisting/X86/ehpad.ll

Example:

http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/53575/

llvm-svn: 343005
2018-09-25 18:41:40 +00:00
Zhaoshi Zheng 2c1a09188f [ConstHoist] Do not rebase single (or few) dependent constant
If an instance (InsertionPoint or IP) of Base constant A has only one or few
rebased constants depending on it, do NOT rebase. One extra ADD instruction is
required to materialize each rebased constant, assuming A and the rebased have
the same materialization cost.

Differential Revision: https://reviews.llvm.org/D52243

llvm-svn: 342994
2018-09-25 17:45:37 +00:00
Sanjay Patel 69ed4710b8 [InstCombine] narrow binops on concatenated vectors (PR33026)
The motivating case from:
https://bugs.llvm.org/show_bug.cgi?id=33026
...has no shuffles now. This kind of pattern may occur during
vectorization when targets have lumpy ISAs like SSE/AVX.

llvm-svn: 342988
2018-09-25 15:57:37 +00:00
David Green 9108c2b921 [LoopUnroll] Add check to Latch's terminator in UnrollRuntimeLoopRemainder
In this patch, I'm adding an extra check to the Latch's terminator in llvm::UnrollRuntimeLoopRemainder,
similar to how it is already done in the llvm::UnrollLoop.

The compiler would crash if this function is called with a malformed loop.

Patch by Rodrigo Caetano Rocha!

Differential Revision: https://reviews.llvm.org/D51486

llvm-svn: 342958
2018-09-25 10:08:47 +00:00
Evgeniy Stepanov 090f0f9504 [hwasan] Record and display stack history in stack-based reports.
Summary:
Display a list of recent stack frames (not a stack trace!) when
tag-mismatch is detected on a stack address.

The implementation uses alignment tricks to get both the address of
the history buffer, and the base address of the shadow with a single
8-byte load. See the comment in hwasan_thread_list.h for more
details.

Developed in collaboration with Kostya Serebryany.

Reviewers: kcc

Subscribers: srhines, kubamracek, mgorny, hiraditya, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D52249

llvm-svn: 342923
2018-09-24 23:03:34 +00:00
Evgeniy Stepanov 20c4999e8b Revert "[hwasan] Record and display stack history in stack-based reports."
This reverts commit r342921: test failures on clang-cmake-arm* bots.

llvm-svn: 342922
2018-09-24 22:50:32 +00:00
Evgeniy Stepanov 9043e17edd [hwasan] Record and display stack history in stack-based reports.
Summary:
Display a list of recent stack frames (not a stack trace!) when
tag-mismatch is detected on a stack address.

The implementation uses alignment tricks to get both the address of
the history buffer, and the base address of the shadow with a single
8-byte load. See the comment in hwasan_thread_list.h for more
details.

Developed in collaboration with Kostya Serebryany.

Reviewers: kcc

Subscribers: srhines, kubamracek, mgorny, hiraditya, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D52249

llvm-svn: 342921
2018-09-24 21:38:42 +00:00
Christy Lee e94374809e Re-submitting changes in D51550 because it failed to patch.
Reviewers: javed.absar, trentxintong, courbet

Reviewed By: trentxintong

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52433

llvm-svn: 342919
2018-09-24 20:47:12 +00:00
Sanjay Patel 4674c7765d [InstCombine] add bitcast+extelt helper function; NFC
We can handle patterns where the elements have different 
sizes, so refactoring ahead of trying to add another blob
within these clauses.

llvm-svn: 342918
2018-09-24 20:41:22 +00:00
Sanjay Patel 7a52626a08 [InstCombine] improve variable name and use 'match'; NFC
'width' of a vector usually refers to the bit-width.

https://bugs.llvm.org/show_bug.cgi?id=39016
shows a case where we could extend this fold to handle
a case where the number of elements in the bitcasted
vector is not equal to the resulting value.

llvm-svn: 342902
2018-09-24 16:39:03 +00:00
Petar Jovanovic c451c9ef50 [deadargelim] Update dbg.value of 'unused' parameters
DeadArgElim pass marks unused function arguments as ‘undef’ without updating
existing dbg.values referring to it. As a consequence the debug info
metadata in the final executable was wrong.

Patch by Djordje Todorovic.

Differential Revision: https://reviews.llvm.org/D51968

llvm-svn: 342871
2018-09-24 10:01:24 +00:00
Eugene Leviant 2b70d616f0 [WholeProgramDevirt] Don't process declarations when building type id map
Differential revision: https://reviews.llvm.org/D52175

llvm-svn: 342836
2018-09-23 13:27:47 +00:00
Sanjay Patel 09e02fbf51 [InstCombine][x86] try even harder to convert blendv intrinsic to generic IR (PR38814)
Follow-up to rL342324 (D52059):

Missing optimizations with blendv are shown in:
https://bugs.llvm.org/show_bug.cgi?id=38814

This is an easier and more powerful solution than adding pattern matching for a few 
special cases in the backend. The potential danger with this transform in IR is that 
the condition value can get separated from the select, and the backend might not be 
able to make a blendv out of it again.

llvm-svn: 342806
2018-09-22 14:43:55 +00:00
Craig Topper 2b3f5df73a [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible
Summary: This restores the combine that was reverted in r341883. The infinite loop from the failing test no longer occurs due to changes from r342163.

Reviewers: spatel, dmgreen

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52070

llvm-svn: 342797
2018-09-22 05:53:27 +00:00
Warren Ristow 4f27730eaf [Loop Vectorizer] Abandon vectorization when no integer IV found
Support for vectorizing loops with secondary floating-point induction
variables was added in r276554.  A primary integer IV is still required
for vectorization to be done.  If an FP IV was found, but no integer IV
was found at all (primary or secondary), the attempt to vectorize still
went forward, causing a compiler-crash.  This change abandons that
attempt when no integer IV is found.  (Vectorizing FP-only cases like
this, rather than bailing out, is discussed as possible future work
in D52327.)

See PR38800 for more information.

Differential Revision: https://reviews.llvm.org/D52327

llvm-svn: 342786
2018-09-21 23:03:50 +00:00
Sameer Sahasrabuddhe 0807e94951 revert changes from r342722
"[AMDGPU] lower-switch in preISel as a workaround for legacy DA"

This broke regression tests. The first breakage was noticed here:
http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/23549

llvm-svn: 342743
2018-09-21 16:31:51 +00:00
Sameer Sahasrabuddhe 2de7653fd5 [AMDGPU] lower-switch in preISel as a workaround for legacy DA
Summary:
The default target of the switch instruction may sometimes be an
"unreachable" block, when it is guaranteed that one of the cases is
always taken. The dominator tree concludes that such a switch
instruction does not have an immediate post dominator. This confuses
divergence analysis, which is unable to propagate sync dependence to
the targets of the switch instruction.

As a workaround, the AMDGPU target now invokes lower-switch as a
preISel pass. LowerSwitch is designed to handle the unreachable
default target correctly, allowing the divergence analysis to locate
the correct immediate dominator of the now-lowered switch.

Reviewers: arsenm, nhaehnle

Reviewed By: nhaehnle

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits, simoll

Differential Revision: https://reviews.llvm.org/D52221

llvm-svn: 342722
2018-09-21 11:26:55 +00:00
JF Bastien 73d8e4e531 Merge clang's isRepeatedBytePattern with LLVM's isBytewiseValue
Summary:
his code was in CGDecl.cpp and really belongs in LLVM's isBytewiseValue. Teach isBytewiseValue the tricks clang's isRepeatedBytePattern had, including merging undef properly, and recursing on more types.

clang part of this patch: D51752

Subscribers: dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D51751

llvm-svn: 342709
2018-09-21 05:17:42 +00:00
Xin Tong 9b7e45159f [GlobalDCE] AvailableExternal linkage is checked in isDiscardableIfUnused [NFC].
Summary:
AvailableExternal was not handled in isDiscardableIfUnused when isDiscardableIfUnused
was added in r158476. Till it was handled in r247044. This is a NFC.

Reviewers: pcc, tejohnson

Reviewed By: tejohnson

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52319

llvm-svn: 342684
2018-09-20 21:16:16 +00:00
Fedor Sergeev ee8d31c49e [New PM] Introducing PassInstrumentation framework
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@

The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.

Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
  and access to them.

* PassInstrumentation class that handles instrumentation-point interfaces
  that call into PassInstrumentationCallbacks.

* Callbacks accept StringRef which is just a name of the Pass right now.
  There were some ideas to pass an opaque wrapper for the pointer to pass instance,
  however it appears that pointer does not actually identify the instance
  (adaptors and managers might have the same address with the pass they govern).
  Hence it was decided to go simple for now and then later decide on what the proper
  mental model of identifying a "pass in a phase of pipeline" is.

* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
  on different IRUnits (e.g. Analyses).

* PassInstrumentationAnalysis analysis is explicitly requested from PassManager through
  usual AnalysisManager::getResult. All pass managers were updated to run that
  to get PassInstrumentation object for instrumentation calls.

* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
  args out of a generic PassManager's extra args. This is the only way I was able to explicitly
  run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
  RepeatedPass::run.
  TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
  and then get rid of getAnalysisResult by improving RepeatedPass implementation.

* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
  PassInstrumentationAnalysis. Callbacks registration should be performed directly
  through PassInstrumentationCallbacks.

* new-pm tests updated to account for PassInstrumentationAnalysis being run

* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
  Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.

  Made getName helper to return std::string (instead of StringRef initially) to fix
  asan builtbot failures on CGSCC tests.

Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858

llvm-svn: 342664
2018-09-20 17:08:45 +00:00
Jesper Antonsson 719fa055d0 [InstCombine] Handle vector compares in foldGEPIcmp()
Summary:
This is to fix PR38984 "InstCombine assertion at vector gep/icmp folding":
https://bugs.llvm.org/show_bug.cgi?id=38984

Reviewers: majnemer, spatel, lattner, lebedev.ri

Reviewed By: lebedev.ri

Subscribers: lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D52263

llvm-svn: 342647
2018-09-20 13:37:28 +00:00
Bjorn Pettersson cd53b7f54e [IPSCCP] Fix a problem with removing labels in a switch with undef condition
Summary:
Before removing basic blocks that ipsccp has considered as dead
all uses of the basic block label must be removed. That is done
by calling ConstantFoldTerminator on the users. An exception
is when the branch condition is an undef value. In such
scenarios ipsccp is using some internal assumptions regarding
which edge in the control flow that should remain, while
ConstantFoldTerminator don't know how to fold the terminator.

The problem addressed here is related to ConstantFoldTerminator's
ability to rewrite a 'switch' into a conditional 'br'. In such
situations ConstantFoldTerminator returns true indicating that
the terminator has been rewritten. However, ipsccp treated the
true value as if the edge to the dead basic block had been
removed. So the code for resolving an undef branch condition
did not trigger, and we ended up with assertion that there were
uses remaining when deleting the basic block.

The solution is to resolve indeterminate branches before the
call to ConstantFoldTerminator.

Reviewers: efriedma, fhahn, davide

Reviewed By: fhahn

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52232

llvm-svn: 342632
2018-09-20 09:00:17 +00:00
Calixte Denizet eb7f60201c [IR] Add a boolean field in DILocation to know if a line must covered or not
Summary:
Some lines have a hit counter where they should not have one.
For example, in C++, some cleanup is adding at the end of a scope represented by a '}'.
So such a line has a hit counter where a user expects to not have one.
The goal of the patch is to add this information in DILocation which is used to get the covered lines in GCOVProfiling.cpp.
A following patch in clang will add this information when generating IR (https://reviews.llvm.org/D49916).

Reviewers: marco-c, davidxl, vsk, javed.absar, rnk

Reviewed By: rnk

Subscribers: eraman, xur, danielcdh, aprantl, rnk, dblaikie, #debug-info, vsk, llvm-commits, sylvestre.ledru

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D49915

llvm-svn: 342631
2018-09-20 08:53:06 +00:00
Eric Christopher 019889374b Temporarily Revert "[New PM] Introducing PassInstrumentation framework"
as it was causing failures in the asan buildbot.

This reverts commit r342597.

llvm-svn: 342616
2018-09-20 05:16:29 +00:00
Fedor Sergeev a5f279ea89 [New PM] Introducing PassInstrumentation framework
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@

The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.

Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
  and access to them.

* PassInstrumentation class that handles instrumentation-point interfaces
  that call into PassInstrumentationCallbacks.

* Callbacks accept StringRef which is just a name of the Pass right now.
  There were some ideas to pass an opaque wrapper for the pointer to pass instance,
  however it appears that pointer does not actually identify the instance
  (adaptors and managers might have the same address with the pass they govern).
  Hence it was decided to go simple for now and then later decide on what the proper
  mental model of identifying a "pass in a phase of pipeline" is.

* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
  on different IRUnits (e.g. Analyses).

* PassInstrumentationAnalysis analysis is explicitly requested from PassManager through
  usual AnalysisManager::getResult. All pass managers were updated to run that
  to get PassInstrumentation object for instrumentation calls.

* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
  args out of a generic PassManager's extra args. This is the only way I was able to explicitly
  run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
  RepeatedPass::run.
  TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
  and then get rid of getAnalysisResult by improving RepeatedPass implementation.

* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
  PassInstrumentationAnalysis. Callbacks registration should be performed directly
  through PassInstrumentationCallbacks.

* new-pm tests updated to account for PassInstrumentationAnalysis being run

* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
  Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.

Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858

llvm-svn: 342597
2018-09-19 22:42:57 +00:00
Matt Morehouse e62fc3d0b6 [InstCombine] Disable strcmp->memcmp transform for MSan.
Summary:
The strcmp->memcmp transform can make the resulting memcmp read
uninitialized data, which MSan doesn't like.

Resolves https://github.com/google/sanitizers/issues/993.

Reviewers: eugenis, xbolva00

Reviewed By: eugenis

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D52272

llvm-svn: 342582
2018-09-19 19:37:24 +00:00
Fedor Sergeev 25de3f83be Revert rL342544: [New PM] Introducing PassInstrumentation framework
A bunch of bots fail to compile unittests. Reverting.

llvm-svn: 342552
2018-09-19 14:54:48 +00:00
Roman Lebedev f50023d37c [InstCombine] foldICmpWithLowBitMaskedVal(): handle uncanonical ((-1 << y) >> y) mask
Summary:
The last low-bit-mask-pattern-producing-pattern i can think of.

https://rise4fun.com/Alive/UGzE <- non-canonical
But we can not canonicalize it because of extra uses.

https://bugs.llvm.org/show_bug.cgi?id=38123

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52148

llvm-svn: 342548
2018-09-19 13:35:46 +00:00
Roman Lebedev ca2bdb03d6 [InstCombine] foldICmpWithLowBitMaskedVal(): handle uncanonical ((1 << y)+(-1)) mask
Summary:
Same as to D52146.
`((1 << y)+(-1))` is simply non-canoniacal version of `~(-1 << y)`: https://rise4fun.com/Alive/0vl
We can not canonicalize it due to the extra uses. But we can handle it here.

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52147

llvm-svn: 342547
2018-09-19 13:35:40 +00:00
Roman Lebedev 183a465dc6 [InstCombine] foldICmpWithLowBitMaskedVal(): handle ~(-1 << y) mask
Summary:
Two folds are happening here:
1. https://rise4fun.com/Alive/oaFX
2. And then `foldICmpWithHighBitMask()` (D52001): https://rise4fun.com/Alive/wsP4

This change doesn't just add the handling for eq/ne predicates,
it actually builds upon the previous `foldICmpWithLowBitMaskedVal()` work,
so **all** the 16 fold variants* are immediately supported.

I'm indeed only testing these two predicates.
I do not feel like re-proving all 16 folds*, because they were already proven
for the general case of constant with all-ones in low bits. So as long as
the mask produces all-ones in low bits, i'm pretty sure the fold is valid.

But required, i can re-prove, let me know.

* eq/ne are commutative - 4 folds; ult/ule/ugt/uge - are not commutative (the commuted variant is InstSimplified), 4 folds; slt/sle/sgt/sge are not commutative - 4 folds. 12 folds in total.

https://bugs.llvm.org/show_bug.cgi?id=38123
https://bugs.llvm.org/show_bug.cgi?id=38708

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52146

llvm-svn: 342546
2018-09-19 13:35:27 +00:00
Fedor Sergeev 875c938fec [New PM] Introducing PassInstrumentation framework
Summary:
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@

The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.

Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
  and access to them.

* PassInstrumentation class that handles instrumentation-point interfaces
  that call into PassInstrumentationCallbacks.

* Callbacks accept StringRef which is just a name of the Pass right now.
  There were some ideas to pass an opaque wrapper for the pointer to pass instance,
  however it appears that pointer does not actually identify the instance
  (adaptors and managers might have the same address with the pass they govern).
  Hence it was decided to go simple for now and then later decide on what the proper
  mental model of identifying a "pass in a phase of pipeline" is.

* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
  on different IRUnits (e.g. Analyses).

* PassInstrumentationAnalysis analysis is explicitly requested from PassManager through
  usual AnalysisManager::getResult. All pass managers were updated to run that
  to get PassInstrumentation object for instrumentation calls.

* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
  args out of a generic PassManager's extra args. This is the only way I was able to explicitly
  run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
  RepeatedPass::run.
  TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
  and then get rid of getAnalysisResult by improving RepeatedPass implementation.

* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
  PassInstrumentationAnalysis. Callbacks registration should be performed directly
  through PassInstrumentationCallbacks.

* new-pm tests updated to account for PassInstrumentationAnalysis being run

* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
  Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.

Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858

llvm-svn: 342544
2018-09-19 12:25:52 +00:00
Benjamin Kramer e5e1ea79fd [InstCombine] Don't transform sin/cos -> tanl if for half types
This is still unsafe for long double, we will transform things into tanl
even if tanl is for another type. But that's for someone else to fix.

llvm-svn: 342542
2018-09-19 12:01:38 +00:00
Carlos Alberto Enciso ba4e437c6a [DebugInfo][Dexter] Speculated BB presents illegal variable value to debugger.
When SimplifyCFG changes the PHI node into a select instruction, the debug information becomes ambiguous. It causes the debugger to display wrong variable value. 

Differential Revision: https://reviews.llvm.org/D51976

llvm-svn: 342527
2018-09-19 08:16:56 +00:00
Christy Lee c85da8bd9a Do not optimize atomic load to non-atomic memcmp
Differential Revision: https://reviews.llvm.org/D51998

llvm-svn: 342498
2018-09-18 17:02:42 +00:00
Hiroshi Yamauchi fd2c699dd6 [PGO][CHR] Add opt remarks.
Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52056

llvm-svn: 342495
2018-09-18 16:50:10 +00:00
Teresa Johnson 5e1c0e7610 [LTO] Make detection of WPD remark enablement more robust
Summary:
Currently only the first function in the module is checked to
see if it has remarks enabled. If that first function is a declaration,
remarks will be incorrectly skipped. Change to look for the first
non-empty function.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D51556

llvm-svn: 342477
2018-09-18 13:42:24 +00:00
whitequark e7e14f464c [LLVM-C][OCaml] Add UnifyFunctionExitNodes pass to C and OCaml APIs
Summary:
Adds LLVMAddUnifyFunctionExitNodesPass to expose
createUnifyFunctionExitNodesPass to the C and OCaml APIs.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52212

llvm-svn: 342476
2018-09-18 13:36:03 +00:00
whitequark 1f50560e56 [LLVM-C][OCaml] Add LowerAtomic pass to C and OCaml APIs
Summary:
Adds LLVMAddLowerAtomicPass to expose createLowerAtomicPass in the C
and OCaml APIs.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D52211

llvm-svn: 342475
2018-09-18 13:35:50 +00:00
Max Kazantsev 0994abda3a [IndVars] Remove unreasonable checks in rewriteLoopExitValues
A piece of logic in rewriteLoopExitValues has a weird check on number of
users which allowed an unprofitable transform in case if an instruction has
more than 6 users.

Differential Revision: https://reviews.llvm.org/D51404
Reviewed By: etherzhhb

llvm-svn: 342444
2018-09-18 04:57:18 +00:00
Matt Arsenault c640798597 LSV: Fix adjust alloca alignment trick for AMDGPU
This was checking the hardcoded address space 0 for the stack.
Additionally, this should be checking for legality with
the adjusted alignment, so defer the alignment check.

Also try to split if the unaligned access isn't allowed.

llvm-svn: 342442
2018-09-18 02:05:44 +00:00
Alina Sbirlea a782a70ad9 [EarlyCSEwMemorySSA] Add MSSA verification and tests to make EarlyCSE failures easier to track.
Summary:
EarlyCSE can make IR changes that will leave MemorySSA with accesses claiming to be optimized, but for which a subsequent MemorySSA run will yield a different optimized result.
Due to relying on AA queries, we can't fix this in general, unless we recompute MemorySSA.
Adding some tests to track this and a basic verify for future potential failures.

Reviewers: george.burgess.iv, gberry

Subscribers: sanjoy, jlebar, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D51960

llvm-svn: 342422
2018-09-17 22:35:21 +00:00
Xin Tong 8a505c64b0 [CVP] Handle instructions with no user. No need to create CVPLattice state. This handles terminator instructions and more.
Summary:
I tested this patch by compiling sqlite3.ll (clang -O3 -mllvm -disable-llvm-optzns sqlite3.c.)

 opt -called-value-propagation sqlite3.ll -time-passes -f -o out.ll

 I get 10+% speedup for the pass. I expect some of the gain come from skipping terminator instructions.

    === BEFORE THE PATCH ===
    ===-------------------------------------------------------------------------===
                          ... Pass execution timing report ...
    ===-------------------------------------------------------------------------===
      Total Execution Time: 0.5562 seconds (0.5582 wall clock)

       ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
       0.2485 ( 46.4%)   0.0120 ( 57.7%)   0.2605 ( 46.8%)   0.2615 ( 46.8%)  Bitcode Writer
       0.1607 ( 30.0%)   0.0079 ( 37.7%)   0.1685 ( 30.3%)   0.1693 ( 30.3%)  Called Value Propagation
       0.1262 ( 23.6%)   0.0009 (  4.5%)   0.1271 ( 22.9%)   0.1275 ( 22.8%)  Module Verifier
       0.5353 (100.0%)   0.0209 (100.0%)   0.5562 (100.0%)   0.5582 (100.0%)  Total

    === AFTER THE PATCH ===
    ===-------------------------------------------------------------------------===
                          ... Pass execution timing report ...
    ===-------------------------------------------------------------------------===
      Total Execution Time: 0.5338 seconds (0.5355 wall clock)

       ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
       0.2498 ( 48.6%)   0.0118 ( 59.3%)   0.2615 ( 49.0%)   0.2629 ( 49.1%)  Bitcode Writer
       0.1377 ( 26.8%)   0.0075 ( 37.8%)   0.1452 ( 27.2%)   0.1455 ( 27.2%)  Called Value Propagation
       0.1264 ( 24.6%)   0.0006 (  3.0%)   0.1270 ( 23.8%)   0.1271 ( 23.7%)  Module Verifier
       0.5139 (100.0%)   0.0199 (100.0%)   0.5338 (100.0%)   0.5355 (100.0%)  Total

Reviewers: davide, mssimpso

Reviewed By: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49108

llvm-svn: 342398
2018-09-17 15:28:01 +00:00
Alexandros Lamprineas 8a1c374b2e [GVNHoist] Re-enable GVNHoist by default
Rebase rL341954 since https://bugs.llvm.org/show_bug.cgi?id=38912
has been fixed by rL342055.

Precommit testing performed:
* Overnight runs of csmith comparing the output between programs
  compiled with gvn-hoist enabled/disabled.
* Bootstrap builds of clang with UbSan/ASan configurations.

llvm-svn: 342387
2018-09-17 12:24:55 +00:00
Max Kazantsev 5fe3620261 [NFC] Turn unsigned counters into boolean flags
llvm-svn: 342360
2018-09-17 06:33:29 +00:00
Craig Topper 2da7381678 [InstCombine] Support (sub (sext x), (sext y)) --> (sext (sub x, y)) and (sub (zext x), (zext y)) --> (zext (sub x, y))
Summary:
If the sub doesn't overflow in the original type we can move it above the sext/zext.

This is similar to what we do for add. The overflow checking for sub is currently weaker than add, so the test cases are constructed for what is supported.

Reviewers: spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52075

llvm-svn: 342335
2018-09-15 18:54:10 +00:00
Sanjay Patel 296d35a5e9 [InstCombine][x86] try harder to convert blendv intrinsic to generic IR (PR38814)
Missing optimizations with blendv are shown in:
https://bugs.llvm.org/show_bug.cgi?id=38814

If this works, it's an easier and more powerful solution than adding pattern matching 
for a few special cases in the backend. The potential danger with this transform in IR
is that the condition value can get separated from the select, and the backend might 
not be able to make a blendv out of it again. I don't think that's too likely, but 
I've kept this patch minimal with a 'TODO', so we can test that theory in the wild 
before expanding the transform.

Differential Revision: https://reviews.llvm.org/D52059

llvm-svn: 342324
2018-09-15 14:25:44 +00:00
Roman Lebedev 1b7fc87020 [InstCombine] Inefficient pattern for high-bits checking 3 (PR38708)
Summary:
It is sometimes important to check that some newly-computed value
is non-negative and only n bits wide (where n is a variable.)
There are many ways to check that:
https://godbolt.org/z/o4RB8D
The last variant seems best?
(I'm sure there are some other variations i haven't thought of..)

The last (as far i know?) pattern, non-canonical due to the extra use.
https://godbolt.org/z/aCMsPk
https://rise4fun.com/Alive/I6f

https://bugs.llvm.org/show_bug.cgi?id=38708

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52062

llvm-svn: 342321
2018-09-15 12:04:13 +00:00
Sanjay Patel 90a36346bc [InstCombine] refactor mul narrowing folds; NFCI
Similar to rL342278:
The test diffs are all cosmetic due to the change in
value naming, but I'm including that to show that the
new code does perform these folds rather than something
else in instcombine.

D52075 should be able to use this code too rather than
duplicating all of the logic.

llvm-svn: 342292
2018-09-14 22:23:35 +00:00
Sanjay Patel 46945b9e9d [InstCombine] add/use overflowing math helper functions; NFC
The mul case can already be refactored to use this similar to
rL342278.
The sub case is proposed in D52075.

llvm-svn: 342289
2018-09-14 21:30:07 +00:00
Wei Mi 6a14325dff [SampleFDO] Add FunctionOffsetTable in compact binary format profile.
The patch saves a function offset table which maps function name index to the
offset of its function profile to the start of the binary profile. By using
the function offset table, for those function profiles which will not be used
when compiling a module, the profile reader does't have to read them. For
profile size around 10~20M, it saves ~10% compile time.

Differential Revision: https://reviews.llvm.org/D51863

llvm-svn: 342283
2018-09-14 20:52:59 +00:00
Sanjay Patel 2426eb46dd [InstCombine] refactor add narrowing folds; NFCI
The test diffs are all cosmetic due to the change in
value naming, but I'm including that to show that the
new code does perform these folds rather than something
else in instcombine.

llvm-svn: 342278
2018-09-14 20:40:46 +00:00
Sebastian Pop 0f30f08b02 HotColdSplit: fix invalid SSA due to outlining
The test used to fail with an invalid phi node: the two predecessors were outlined
and the SSA representation was left invalid. The patch adds the exit block to the
cold region.

llvm-svn: 342277
2018-09-14 20:36:19 +00:00
Sebastian Pop 3abcf69074 HotColdSplit: fix isSingleEntrySingleExit
remove duplicate entries from isSingleEntrySingleExit: the Entry block is
already added by the loop over the dominance frontier.

Remove the heuristic from isOutlineCandidate that a region is too small when it
only contains a basic block. With this change we now grow regions starting from
a block and we continue adding to the ValidColdRegion. Check the heuristic just
before code generation.

llvm-svn: 342276
2018-09-14 20:36:14 +00:00
Sebastian Pop 1217160bb3 HotColdSplit: add back propagation to extend cold regions
Also fix a problem in forward propagation:
  const TerminatorInst *TI = It->getTerminator();
was set outside the while loop that iterates over It.

llvm-svn: 342275
2018-09-14 20:36:10 +00:00
Florian Hahn 3afb974aa5 [LoopInterchange] Preserve ScalarEvolution, by forgetting about interchanged loops.
As preparation for LoopInterchange becoming a loop pass, it needs to
preserve ScalarEvolution. Even though interchanging should not change
the trip count of the loop, it modifies loop entry, latch and exit
blocks.

I added -verify-scev to some loop interchange tests, but the verification does
not catch problems caused by missing invalidation of SE in loop interchange, as
the trip counts themselves do not change. So there might be potential to
make the SE verification covering more stuff in the future.

Reviewers: mkazantsev, efriedma, karthikthecool

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D52026

llvm-svn: 342209
2018-09-14 07:50:20 +00:00
Max Kazantsev e9765ac275 [NFC] Remove meaningless code from GVN
llvm-svn: 342202
2018-09-14 04:50:38 +00:00
Hideki Saito d19851ac7e Fix for the buildbot failure http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/23635
from the commit (r342197) of https://reviews.llvm.org/D50820.

llvm-svn: 342201
2018-09-14 02:02:57 +00:00
Hideki Saito ea7f3035a0 [VPlan] Implement initial vector code generation support for simple outer loops.
Summary:
[VPlan] Implement vector code generation support for simple outer loops.

Context: Patch Series #1 for outer loop vectorization support in LV  using VPlan. (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html).
                                                          
This patch introduces vector code generation support for simple outer loops that are currently supported in the VPlanNativePath. Changes here essentially do the following:

  - force vector code generation using explicit vectorize_width

  - add conservative early returns in cost model and other places for VPlanNativePath

  - add code for setting up outer loop inductions 

  - support for widening non-induction PHIs that can result from inner loops and uniform conditional branches

  - support for generating uniform inner branches

We plan to add a handful C outer loop executable tests once the initial code generation support is committed. This patch is expected to be NFC for the inner loop vectorizer path. Since we are moving in the direction of supporting outer loop vectorization in LV, it may also be time to rename classes such as InnerLoopVectorizer. 

Reviewers: fhahn, rengolin, hsaito, dcaballe, mkuper, hfinkel, Ayal

Reviewed By: fhahn, hsaito

Subscribers: dmgreen, bollu, tschuett, rkruppe, rogfer01, llvm-commits

Differential Revision: https://reviews.llvm.org/D50820

llvm-svn: 342197
2018-09-14 00:36:00 +00:00
Matt Morehouse 3bea25e554 [SanitizerCoverage] Create comdat for global arrays.
Summary:
Place global arrays in comdat sections with their associated functions.
This makes sure they are stripped along with the functions they
reference, even on the BFD linker.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D51902

llvm-svn: 342186
2018-09-13 21:45:55 +00:00
Roman Lebedev 6dc87004fa [InstCombine] Inefficient pattern for high-bits checking 2 (PR38708)
Summary:
It is sometimes important to check that some newly-computed value
is non-negative and only n bits wide (where n is a variable.)
There are many ways to check that:
https://godbolt.org/z/o4RB8D
The last variant seems best?
(I'm sure there are some other variations i haven't thought of..)

More complicated, canonical pattern:
https://rise4fun.com/Alive/uhA

We do need to have two `switch()`'es like this,
to not mismatch the swappable predicates.

https://bugs.llvm.org/show_bug.cgi?id=38708

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52001

llvm-svn: 342173
2018-09-13 20:33:12 +00:00
George Burgess IV d565b0f017 [PartiallyInlineLibCalls] Add DebugCounter support
This adds DebugCounter support to the PartiallyInlineLibCalls pass,
which should make debugging/automated bisection easier in the future.

Patch by Zhizhou Yang!

Differential Revision: https://reviews.llvm.org/D50093

llvm-svn: 342172
2018-09-13 20:33:04 +00:00
George Burgess IV 7a8dc3dafa [DCE] Add DebugCounter support
Patch by Zhizhou Yang!

Differential Revision: https://reviews.llvm.org/D50092

llvm-svn: 342170
2018-09-13 20:29:50 +00:00
Craig Topper 8fc05ce340 [InstCombine] Fold (xor (min/max X, Y), -1) -> (max/min ~X, ~Y) when X and Y are freely invertible.
This allows the xor to be removed completely.

This might help with recomitting r341674, but seems good regardless.

Coincidentally fixes PR38915.

Differential Revision: https://reviews.llvm.org/D51964

llvm-svn: 342163
2018-09-13 18:52:58 +00:00
Sanjay Patel 37e464876b [InstCombine] remove checks for IsFreeToInvert()
I accidentally committed this diff with rL342147 because
I had applied D51964. We probably do need those checks,
but D51964 has tests and more discussion/motivation,
so they should be re-added with that patch.

llvm-svn: 342149
2018-09-13 16:18:12 +00:00
Sanjay Patel 6f00fc3317 [InstCombine] reorder folds to reduce chance of infinite loops
I don't have a test case for this, but it's motivated by
the discussion in D51964, and I've added TODO comments for
the better fix - move simplifications into instsimplify
because that's more efficient and reduces risk of infinite
loops in instcombine caused by transforms trying to do the
opposite folds.

In this case, we know that the transform that tries to move
'not' through min/max can be fooled by the multiple uses
of a value in another min/max, so try to squash the 
foldSPFofSPF() patterns first.

llvm-svn: 342147
2018-09-13 16:04:06 +00:00
Sanjay Patel d341988c86 revert r341288 - [Reassociate] swap binop operands to increase factoring potential
This causes or exposes indeterminism that is visible in the output of -reassociate.

llvm-svn: 342083
2018-09-12 21:29:11 +00:00
Roman Lebedev 75404fb9f8 [InstCombine] Inefficient pattern for high-bits checking (PR38708)
Summary:
It is sometimes important to check that some newly-computed value
is non-negative and only `n` bits wide (where `n` is a variable.)
There are **many** ways to check that:
https://godbolt.org/z/o4RB8D
The last variant seems best?
(I'm sure there are some other variations i haven't thought of..)

Let's handle the second variant first, since it is much simpler.
https://rise4fun.com/Alive/LYjY

https://bugs.llvm.org/show_bug.cgi?id=38708

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51985

llvm-svn: 342067
2018-09-12 18:19:43 +00:00
Alexandros Lamprineas 54d56f274c [GVNHoist] computeInsertionPoints() miscalculates IDF
Fix for https://bugs.llvm.org/show_bug.cgi?id=38912.

In GVNHoist::computeInsertionPoints() we iterate over the Value
Numbers and calculate the Iterated Dominance Frontiers without
clearing the IDFBlocks vector first. IDFBlocks ends up accumulating
an insane number of basic blocks, which bloats the compilation time
of SemaChecking.cpp with ubsan enabled.

Differential Revision: https://reviews.llvm.org/D51980

llvm-svn: 342055
2018-09-12 14:28:23 +00:00
David Green 2352b30c96 [SimplifyCFG] Put an alignment on generated switch tables
Previously the alignment on the newly created switch table data was not set,
meaning that DataLayout::getPreferredAlignment was free to overalign it to 16
bytes. This causes unnecessary code bloat.

Differential Revision: https://reviews.llvm.org/D51800

llvm-svn: 342039
2018-09-12 09:54:17 +00:00
Florian Hahn 1086ce2397 [LV] Move InterleaveGroup and InterleavedAccessInfo to VectorUtils.h (NFC)
Move the 2 classes out of LoopVectorize.cpp to make it easier to re-use
them for VPlan outside LoopVectorize.cpp

Reviewers: Ayal, mssimpso, rengolin, dcaballe, mkuper, hsaito, hfinkel, xbolva00

Reviewed By: rengolin, xbolva00

Differential Revision: https://reviews.llvm.org/D49488

llvm-svn: 342027
2018-09-12 08:01:57 +00:00