Commit Graph

22597 Commits

Author SHA1 Message Date
Johannes Doerfert 766f2cc1a4 [Attributor] Use local linkage instead of internal
Local linkage is internal or private, and private is a specialization of
internal, so either is fine for all our "local linkage" queries.

llvm-svn: 373986
2019-10-07 23:21:52 +00:00
Johannes Doerfert 661db04b98 [Attributor] Use abstract call sites for call site callback
Summary:
When we iterate over uses of functions and expect them to be call sites,
we now use abstract call sites to allow callback calls.

Reviewers: sstefan1, uenoku

Subscribers: hiraditya, bollu, hfinkel, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67871

llvm-svn: 373985
2019-10-07 23:14:58 +00:00
Johannes Doerfert 1097fab1cf [Attributor] Deduce memory behavior of functions and arguments
Deduce the memory behavior, aka "read-none", "read-only", or
"write-only", for functions and arguments.

Reviewers: sstefan1, uenoku

Subscribers: hiraditya, bollu, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67384

llvm-svn: 373965
2019-10-07 21:07:57 +00:00
Roman Lebedev 7cdeac43e5 [InstCombine] Fold conditional sign-extend of high-bit-extract into high-bit-extract-with-signext (PR42389)
This can come up in Bit Stream abstractions.

The pattern looks big/scary, but it can't be simplified any further.
It only is so simple because a number of my preparatory folds had
happened already (shift amount reassociation / shift amount
reassociation in bit test, sign bit test detection).

Highlights:
* There are two main flavors: https://rise4fun.com/Alive/zWi
  The difference is add vs. sub, and left-shift of -1 vs. 1
* Since we only change the shift opcode,
  we can preserve the exact-ness: https://rise4fun.com/Alive/4u4
* There can be truncation after high-bit-extraction:
  https://rise4fun.com/Alive/slHc1   (the main pattern i'm after!)
  Which means that we need to ignore zext of shift amounts and of NBits.
* The sign-extending magic can be extended itself (in add pattern
  via sext, in sub pattern via zext. not the other way around!)
  https://rise4fun.com/Alive/NhG
  (or those sext/zext can be sinked into `select`!)
  Which again means we should pay attention when matching NBits.
* We can have both truncation of extraction and widening of magic:
  https://rise4fun.com/Alive/XTw
  In other words, i don't believe we need to have any checks on
  bitwidths of any of these constructs.

This is worsened in general by the fact that we may have `sext` instead
of `zext` for shift amounts, and we don't yet canonicalize to `zext`,
although we should. I have not done anything about that here.

Also, we really should have something to weed out `sub` like these,
by folding them into `add` variant.

https://bugs.llvm.org/show_bug.cgi?id=42389

llvm-svn: 373964
2019-10-07 20:53:27 +00:00
Roman Lebedev 0c73be590e [InstCombine] Move isSignBitCheck(), handle rest of the predicates
True, no test coverage is being added here. But those non-canonical
predicates that are already handled here already have no test coverage
as far as i can tell. I tried to add tests for them, but all the patterns
already get handled elsewhere.

llvm-svn: 373962
2019-10-07 20:53:08 +00:00
Roman Lebedev cb6d851bb6 [InstCombine][NFC] dropRedundantMaskingOfLeftShiftInput(): change how we deal with mask
Summary:
Currently, we pre-check whether we need to produce a mask or not.
This involves some rather magical constants.
I'd like to extend this fold to also handle the situation
when there's also a `trunc` before outer shift.
That will require another set of magical constants.
It's ugly.

Instead, we can just compute the mask, and check
whether mask is a pass-through (all-ones) or not.
This way we don't need to have any magical numbers.

This change is NFC other than the fact that we now compute
the mask and then check if we need (and can!) apply it.

Reviewers: spatel

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68470

llvm-svn: 373961
2019-10-07 20:53:00 +00:00
Roman Lebedev c3b394ffba [InstCombine] dropRedundantMaskingOfLeftShiftInput(): propagate undef shift amounts
Summary:
When we do `ConstantExpr::getZExt()`, that "extends" `undef` to `0`,
which means that for patterns a/b we'd assume that we must not produce
any bits for that channel, while in reality we simply didn't care
about that channel - i.e. we don't need to mask it.

Reviewers: spatel

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68239

llvm-svn: 373960
2019-10-07 20:52:52 +00:00
Jordan Rose fdaa742174 Second attempt to add iterator_range::empty()
Doing this makes MSVC complain that `empty(someRange)` could refer to
either C++17's std::empty or LLVM's llvm::empty, which previously we
avoided via SFINAE because std::empty is defined in terms of an empty
member rather than begin and end. So, switch callers over to the new
method as it is added.

https://reviews.llvm.org/D68439

llvm-svn: 373935
2019-10-07 18:14:24 +00:00
Martin Storsjo dfc1aee25b Revert "[SLP] avoid reduction transform on patterns that the backend can load-combine"
This reverts SVN r373833, as it caused a failed assert "Non-zero loop
cost expected" on building numerous projects, see PR43582 for details
and reproduction samples.

llvm-svn: 373882
2019-10-07 08:21:37 +00:00
Sanjay Patel aab8b3ab9c [InstCombine] fold fneg disguised as select+fmul (PR43497)
Extends rL373230 and solves the motivating bug (although in a narrow way):
https://bugs.llvm.org/show_bug.cgi?id=43497

llvm-svn: 373851
2019-10-06 14:15:48 +00:00
Sanjay Patel c38881a6b7 [InstCombine] don't assume 'inbounds' for bitcast pointer to GEP transform (PR43501)
https://bugs.llvm.org/show_bug.cgi?id=43501
We can't declare a GEP 'inbounds' in general. But we may salvage that information if
we have known dereferenceable bytes on the source pointer.

Differential Revision: https://reviews.llvm.org/D68244

llvm-svn: 373847
2019-10-06 13:08:08 +00:00
Sanjay Patel e2321bb448 [SLP] avoid reduction transform on patterns that the backend can load-combine
I don't see an ideal solution to these 2 related, potentially large, perf regressions:
https://bugs.llvm.org/show_bug.cgi?id=42708
https://bugs.llvm.org/show_bug.cgi?id=43146

We decided that load combining was unsuitable for IR because it could obscure other
optimizations in IR. So we removed the LoadCombiner pass and deferred to the backend.
Therefore, preventing SLP from destroying load combine opportunities requires that it
recognizes patterns that could be combined later, but not do the optimization itself (
it's not a vector combine anyway, so it's probably out-of-scope for SLP).

Here, we add a scalar cost model adjustment with a conservative pattern match and cost
summation for a multi-instruction sequence that can probably be reduced later.
This should prevent SLP from creating a vector reduction unless that sequence is
extremely cheap.

In the x86 tests shown (and discussed in more detail in the bug reports), SDAG combining
will produce a single instruction on these tests like:

  movbe   rax, qword ptr [rdi]

or:

  mov     rax, qword ptr [rdi]

Not some (half) vector monstrosity as we currently do using SLP:

  vpmovzxbq       ymm0, dword ptr [rdi + 1] # ymm0 = mem[0],zero,zero,..
  vpsllvq ymm0, ymm0, ymmword ptr [rip + .LCPI0_0]
  movzx   eax, byte ptr [rdi]
  movzx   ecx, byte ptr [rdi + 5]
  shl     rcx, 40
  movzx   edx, byte ptr [rdi + 6]
  shl     rdx, 48
  or      rdx, rcx
  movzx   ecx, byte ptr [rdi + 7]
  shl     rcx, 56
  or      rcx, rdx
  or      rcx, rax
  vextracti128    xmm1, ymm0, 1
  vpor    xmm0, xmm0, xmm1
  vpshufd xmm1, xmm0, 78          # xmm1 = xmm0[2,3,0,1]
  vpor    xmm0, xmm0, xmm1
  vmovq   rax, xmm0
  or      rax, rcx
  vzeroupper
  ret

Differential Revision: https://reviews.llvm.org/D67841

llvm-svn: 373833
2019-10-05 18:03:58 +00:00
Aditya Kumar 6a2673605e Invalidate assumption cache before outlining.
Subscribers: llvm-commits

Tags: #llvm

Reviewers: compnerd, vsk, sebpop, fhahn, tejohnson

Reviewed by: vsk

Differential Revision: https://reviews.llvm.org/D68478

llvm-svn: 373807
2019-10-04 22:46:42 +00:00
Roman Lebedev fb5af8b9b9 [InstCombine] Fold 'icmp eq/ne (?trunc (lshr/ashr %x, bitwidth(x)-1)), 0' -> 'icmp sge/slt %x, 0'
We do indeed already get it right in some cases, but only transitively,
with one-use restrictions. Since we only need to produce a single
comparison, it makes sense to match the pattern directly:
  https://rise4fun.com/Alive/kPg

llvm-svn: 373802
2019-10-04 22:16:22 +00:00
Roman Lebedev f304d4d185 [InstCombine] Right-shift shift amount reassociation with truncation (PR43564, PR42391)
Initially (D65380) i believed that if we have rightshift-trunc-rightshift,
we can't do any folding. But as it usually happens, i was wrong.

https://rise4fun.com/Alive/GEw
https://rise4fun.com/Alive/gN2O

In https://bugs.llvm.org/show_bug.cgi?id=43564 we happen to have
this very sequence, of two right shifts separated by trunc.
And "just" so that happens, we apparently can fold the pattern
if the total shift amount is either 0, or it's equal to the bitwidth
of the innermost widest shift - i.e. if we are left with only the
original sign bit. Which is exactly what is wanted there.

llvm-svn: 373801
2019-10-04 22:16:11 +00:00
Peter Collingbourne 71662116fd LowerTypeTests: Rename local functions to avoid collisions with identically named functions in ThinLTO modules.
Without this we can encounter link errors or incorrect behaviour
at runtime as a result of the wrong function being referenced.

Differential Revision: https://reviews.llvm.org/D67945

llvm-svn: 373678
2019-10-03 23:42:44 +00:00
Alina Sbirlea 145cdad119 [MemorySSA] Don't hoist stores if interfering uses (as calls) exist.
llvm-svn: 373674
2019-10-03 22:20:04 +00:00
Bardia Mahjour f6c34de117 [PGO] Refactor Value Profiling into a plugin based oracle and create a well defined API for the plugins.
Summary: This PR creates a utility class called ValueProfileCollector that tells PGOInstrumentationGen and PGOInstrumentationUse what to value-profile and where to attach the profile metadata. It then refactors logic scattered in PGOInstrumentation.cpp into two plugins that plug into the ValueProfileCollector.

Authored By: Wael Yehia <wyehia@ca.ibm.com>

Reviewer: davidxl, tejohnson, xur

Reviewed By: davidxl, tejohnson, xur

Subscribers: llvm-commits

Tag: #llvm

Differential Revision: https://reviews.llvm.org/D67920

Patch By Wael Yehia <wyehia@ca.ibm.com>

llvm-svn: 373601
2019-10-03 14:20:50 +00:00
Guillaume Chatelet d400d45150 [Alignment][NFC] Remove StoreInst::setAlignment(unsigned)
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, bollu, jdoerfert

Subscribers: hiraditya, asbirlea, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68268

llvm-svn: 373595
2019-10-03 13:17:21 +00:00
Roman Lebedev ae3315af07 [InstCombine] Bypass high bit extract before variable sign-extension (PR43523)
https://rise4fun.com/Alive/8BY - valid for lshr+trunc+variable sext
https://rise4fun.com/Alive/7jk - the variable sext can be redundant

https://rise4fun.com/Alive/Qslu - 'exact'-ness of first shift can be preserver

https://rise4fun.com/Alive/IF63 - without trunc we could view this as
                                  more general "drop redundant mask before right-shift",
                                  but let's handle it here for now
https://rise4fun.com/Alive/iip - likewise, without trunc, variable sext can be redundant.

There's more patterns for sure - e.g. we can have 'lshr' as the final shift,
but that might be best handled by some more generic transform, e.g.
"drop redundant masking before right-shift" (PR42456)

I'm singling-out this sext patch because you can only extract
high bits with `*shr` (unlike abstract bit masking),
and i *know* this fold is wanted by existing code.

I don't believe there is much to review here,
so i'm gonna opt into post-review mode here.

https://bugs.llvm.org/show_bug.cgi?id=43523

llvm-svn: 373542
2019-10-02 23:02:12 +00:00
David Bolvansky 6b45029676 [InstCombine] Transform bcopy to memmove
bcopy is still widely used mainly for network apps. Sadly, LLVM has no optimizations for bcopy, but there are some for memmove. 
Since bcopy == memmove, it is profitable to transform bcopy to memmove and use current optimizations for memmove for free here.

llvm-svn: 373537
2019-10-02 22:49:20 +00:00
Florian Hahn a03d7b0f24 Recommit "[GlobalOpt] Pass DTU to removeUnreachableBlocks instead of recomputing."
The cause for the revert should be fixed by r373513 /
a80b6c1542

This reverts commit 47dbcbd8ec.

llvm-svn: 373522
2019-10-02 20:40:13 +00:00
Evgeniy Stepanov 464df87288 Handle llvm.launder.invariant.group in msan.
Summary:
[MSan] handle llvm.launder.invariant.group

    Msan used to give false-positives in

    class Foo {
     public:
      virtual ~Foo() {};
    };

    // Return true iff *x is set.
    bool f1(void **x, bool flag);

    Foo* f() {
      void *p;
      bool found;
      found = f1(&p,flag);
      if (found) {
        // p is always set here.
        return static_cast<Foo*>(p); // False positive here.
      }
      return nullptr;
    }

Patch by Ilya Tokar.

Reviewers: #sanitizers, eugenis

Reviewed By: #sanitizers, eugenis

Subscribers: eugenis, Prazek, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68236

llvm-svn: 373515
2019-10-02 19:53:19 +00:00
Florian Hahn a80b6c1542 [Local] Handle terminators with users in removeUnreachableBlocks.
Terminators like invoke can have users outside the current basic block.
We have to replace those users with undef, before replacing the
terminator.

This fixes a crash exposed by rL373430.

Reviewers: brzycki, asbirlea, davide, spatel

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D68327

llvm-svn: 373513
2019-10-02 19:38:24 +00:00
Florian Hahn eb6700b57e [Local] Remove unused LazyValueInfo pointer from removeUnreachableBlock.
There are no users that pass in LazyValueInfo, so we can simplify the
function a bit.

Reviewers: brzycki, asbirlea, davide

Reviewed By: davide

Differential Revision: https://reviews.llvm.org/D68297

llvm-svn: 373488
2019-10-02 16:58:13 +00:00
Teresa Johnson 077cc3fcb0 [ThinLTO/WPD] Ensure devirtualized targets use promoted symbol when necessary
Summary:
This fixes a hole in the handling of devirtualized targets that were
local but need promoting due to devirtualization in another module. We
were not correctly referencing the promoted symbol in some cases. Make
sure the code that updates the name also looks at the ExportedGUIDs set
by utilizing a callback that checks all conditions (the callback
utilized by the internalization/promotion code).

Reviewers: pcc, davidxl, hiraditya

Subscribers: mehdi_amini, Prazek, inglorion, steven_wu, dexonsmith, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68159

llvm-svn: 373485
2019-10-02 16:36:59 +00:00
Simon Pilgrim 91b4085b03 LowerExpectIntrinsic handlePhiDef - silence static analyzer dyn_cast<PHINode> null dereference warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<PHINode> directly and if not assert will fire for us.

llvm-svn: 373481
2019-10-02 16:03:45 +00:00
Aditya Kumar c4a7b912c2 [CodeExtractor] NFC: Refactor sanity checks into isEligible
Reviewers: fhahn

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68331

llvm-svn: 373479
2019-10-02 15:36:39 +00:00
Aditya Kumar b1fe6c90e6 NFC: directly return when CommonExitBlock != Succ
Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68330

llvm-svn: 373456
2019-10-02 12:15:17 +00:00
Simon Pilgrim da4cbae696 LICM - remove unused variable and reduce scope of another variable. NFCI.
Appeases both clang static analyzer and cppcheck

llvm-svn: 373453
2019-10-02 11:49:53 +00:00
Florian Hahn 47dbcbd8ec Revert [GlobalOpt] Pass DTU to removeUnreachableBlocks instead of recomputing.
This breaks http://lab.llvm.org:8011/builders/sanitizer-windows/builds/52310

This reverts r373430 (git commit 70f7003548)

llvm-svn: 373432
2019-10-02 08:32:25 +00:00
Florian Hahn 70f7003548 [GlobalOpt] Pass DTU to removeUnreachableBlocks instead of recomputing.
removeUnreachableBlocks knows how to preserve the DomTree, so make use
of it instead of re-computing the DT.

Reviewers: davide, kuhar, brzycki

Reviewed By: davide, kuhar

Differential Revision: https://reviews.llvm.org/D68298

llvm-svn: 373430
2019-10-02 08:15:31 +00:00
Florian Hahn 167b0529be [Local] Simplify function removeUnreachableBlocks() to avoid (re-)computation.
Two small changes in llvm::removeUnreachableBlocks() to avoid unnecessary (re-)computation.

First, replace the use of count() with find(), which has better time complexity.

Second, because we have already computed the set of dead blocks, replace the second loop over all basic blocks to a loop only over the already computed dead blocks. This simplifies the loop and avoids recomputation.

Patch by Rodrigo Caetano Rocha <rcor.cs@gmail.com>

Reviewers: efriedma, spatel, fhahn, xbolva00

Reviewed By: fhahn, xbolva00

Differential Revision: https://reviews.llvm.org/D68191

llvm-svn: 373429
2019-10-02 07:37:41 +00:00
Sanjay Patel 9738fd6387 [BypassSlowDivision][CodeGenPrepare] avoid crashing on unused code (PR43514)
https://bugs.llvm.org/show_bug.cgi?id=43514

llvm-svn: 373394
2019-10-01 21:25:36 +00:00
Leonard Chan 8830975cf6 [ASan][NFC] Address remaining comments for https://reviews.llvm.org/D68287
I submitted that patch after I got the LGTM, but the comments didn't
appear until after I submitted the change. This adds `const` to the
constructor argument and makes it a pointer.

llvm-svn: 373391
2019-10-01 20:49:07 +00:00
Leonard Chan 63663616f5 [ASan] Make GlobalsMD member a const reference.
PR42924 points out that copying the GlobalsMetadata type during
construction of AddressSanitizer can result in exteremely lengthened
build times for translation units that have many globals. This can be addressed
by just making the GlobalsMD member in AddressSanitizer a reference to
avoid the copy. The GlobalsMetadata type is already passed to the
constructor as a reference anyway.

Differential Revision: https://reviews.llvm.org/D68287

llvm-svn: 373389
2019-10-01 20:30:46 +00:00
Roman Lebedev 053014f8f9 [InstCombine] Deal with -(trunc(X >>u 63)) -> trunc(X >>s 63)
Identical to it's trunc-less variant, just pretent-to hoist
trunc, and everything else still holds:
https://rise4fun.com/Alive/JRU

llvm-svn: 373364
2019-10-01 17:50:20 +00:00
Roman Lebedev 65144149d0 [InstCombine] Preserve 'exact' in -(X >>u 31) -> (X >>s 31) fold
https://rise4fun.com/Alive/yR4

llvm-svn: 373363
2019-10-01 17:50:09 +00:00
Philip Reames 0200626f0b [IndVars] An implementation of loop predication without a need for speculation
This patch implements a variation of a well known techniques for JIT compilers - we have an implementation in tree as LoopPredication - but with an interesting twist. This version does not assume the ability to execute a path which wasn't taken in the original program (such as a guard or widenable.condition intrinsic). The benefit is that this works for arbitrary IR from any frontend (including C/C++/Fortran). The tradeoff is that it's restricted to read only loops without implicit exits.

This builds on SCEV, and can thus eliminate the loop varying portion of the any early exit where all exits are understandable by SCEV. A key advantage is that fixing deficiency exposed in SCEV - already found one while writing test cases - will also benefit all of full redundancy elimination (and most other loop transforms).

I haven't seen anything in the literature which quite matches this. Given that, I'm not entirely sure that keeping the name "loop predication" is helpful. Anyone have suggestions for a better name? This is analogous to partial redundancy elimination - since we remove the condition flowing around the backedge - and has some parallels to our existing transforms which try to make conditions invariant in loops.

Factoring wise, I chose to put this in IndVarSimplify since it's a generally applicable to all workloads. I could split this off into it's own pass, but we'd then probably want to add that new pass every place we use IndVars.  One solid argument for splitting it off into it's own pass is that this transform is "too good". It breaks a huge number of existing IndVars test cases as they tend to be simple read only loops.  At the moment, I've opted it off by default, but if we add this to IndVars and enable, we'll have to update around 20 test files to add side effects or disable this transform.

Near term plan is to fuzz this extensively while off by default, reflect and discuss on the factoring issue mentioned just above, and then enable by default.  I also need to give some though to supporting widenable conditions in this framing.

Differential Revision: https://reviews.llvm.org/D67408

llvm-svn: 373351
2019-10-01 17:03:44 +00:00
David Bolvansky 4037582d6b Revert [InstCombine] sprintf(dest, "%s", str) -> memccpy(dest, str, 0, MAX)
Seems to be slower than memcpy + strlen.

llvm-svn: 373335
2019-10-01 13:19:04 +00:00
David Bolvansky 8fc6a1bf56 [InstCombine] sprintf(dest, "%s", str) -> memccpy(dest, str, 0, MAX)
llvm-svn: 373333
2019-10-01 13:03:10 +00:00
Evandro Menezes 41ead4281f [SimplifyLibCalls] Define the value of the Euler number
This patch fixes the build break on Windows hosts.

There must be a better way of accessing the equivalent POSIX math constant
`M_E`.

llvm-svn: 373274
2019-09-30 23:21:02 +00:00
Evandro Menezes 110b1138ba [InstCombine] Expand the simplification of log()
Expand the simplification of special cases of `log()` to include `log2()`
and `log10()` as well as intrinsics and more types.

Differential revision: https://reviews.llvm.org/D67199

llvm-svn: 373261
2019-09-30 20:52:21 +00:00
Alina Sbirlea 0fa07f4276 [LegacyPassManager] Deprecate the BasicBlockPass/Manager.
Summary:
The BasicBlockManager is potentially broken and should not be used.
Replace all uses of the BasicBlockPass with a FunctionBlockPass+loop on
blocks.

Reviewers: chandlerc

Subscribers: jholewinski, sanjoy.google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68234

llvm-svn: 373254
2019-09-30 20:17:23 +00:00
David Bolvansky a05e671c7e [FunctionAttrs] Added noalias for memccpy/mempcpy arguments
llvm-svn: 373251
2019-09-30 19:43:48 +00:00
Roman Lebedev faa90eca63 [InstCombine][NFC] visitShl(): call SimplifyQuery::getWithInstruction() once
llvm-svn: 373249
2019-09-30 19:16:00 +00:00
Rong Xu 3674050087 [PGO] Don't group COMDAT variables for compiler generated profile variables in ELF
With this patch, compiler generated profile variables will have its own COMDAT
name for ELF format, which syncs the behavior with COFF. Tested with clang
PGO bootstrap. This shows a modest reduction in object sizes in ELF format.

Differential Revision: https://reviews.llvm.org/D68041

llvm-svn: 373241
2019-09-30 18:11:22 +00:00
Alina Sbirlea 8299fd9dee [EarlyCSE] Pass preserves AA.
llvm-svn: 373231
2019-09-30 17:08:40 +00:00
Sanjay Patel 712b7c2463 [InstCombine] fold negate disguised as select+mul
Name: negate if true
  %sel = select i1 %cond, i32 -1, i32 1
  %r = mul i32 %sel, %x
  =>
  %m = sub i32 0, %x
  %r = select i1 %cond, i32 %m, i32 %x

  Name: negate if false
  %sel = select i1 %cond, i32 1, i32 -1
  %r = mul i32 %sel, %x
  =>
  %m = sub i32 0, %x
  %r = select i1 %cond, i32 %x, i32 %m

https://rise4fun.com/Alive/Nlh

llvm-svn: 373230
2019-09-30 17:02:26 +00:00
Guillaume Chatelet ab11b9188d [Alignment][NFC] Remove AllocaInst::setAlignment(unsigned)
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, arsenm, jvesely, nhaehnle, eraman, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68141

llvm-svn: 373207
2019-09-30 13:34:44 +00:00
Guillaume Chatelet 17380227e8 [Alignment][NFC] Remove LoadInst::setAlignment(unsigned)
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, jdoerfert

Subscribers: hiraditya, asbirlea, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68142

llvm-svn: 373195
2019-09-30 09:37:05 +00:00
Aditya Kumar a6d9d31279 [LLVM-C][Ocaml] Add MergeFunctions and DCE pass
MergeFunctions and DCE pass are missing from OCaml/C-api. This patch
adds them.

Differential Revision: https://reviews.llvm.org/D65071

Reviewers: whitequark, hiraditya, deadalnix

Reviewed By: whitequark

Subscribers: llvm-commits

Tags: #llvm

Authored by: kren1

llvm-svn: 373170
2019-09-29 16:06:22 +00:00
Roman Lebedev d30093bb8a [DivRemPairs] Don't assert that we won't ever get expanded-form rem pairs in different BB's (PR43500)
If we happen to have the same div in two basic blocks,
and in one of those we also happen to have the rem part,
we'd match the div-rem pair, but the wrong ones.
So let's drop overly-ambiguous assert.

Fixes https://bugs.llvm.org/show_bug.cgi?id=43500

llvm-svn: 373167
2019-09-29 15:25:24 +00:00
Alexey Bataev 8b1eeafb91 [SLP] Fix for PR31847: Assertion failed: (isLoopInvariant(Operands[i], L) && "SCEVAddRecExpr operand is not loop-invariant!")
Initially SLP vectorizer replaced all going-to-be-vectorized
instructions with Undef values. It may break ScalarEvaluation and may
cause a crash.
Reworked SLP vectorizer so that it does not replace vectorized
instructions by UndefValue anymore. Instead vectorized instructions are
marked for deletion inside if BoUpSLP class and deleted upon class
destruction.

Reviewers: mzolotukhin, mkuper, hfinkel, RKSimon, davide, spatel

Subscribers: RKSimon, Gerolf, anemet, hans, majnemer, llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D29641

llvm-svn: 373166
2019-09-29 14:18:06 +00:00
Aditya Kumar 2adae76cc6 [NFC] Move hot cold splitting class to header file
Summary:  This is to facilitate unittests

Reviewers: compnerd, vsk, tejohnson, sebpop, brzycki, SirishP

Reviewed By: tejohnson

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68079

llvm-svn: 373151
2019-09-28 18:13:33 +00:00
Wei Mi f0c4e70e95 [SampleFDO] Create a separate flag profile-accurate-for-symsinlist to handle
profile symbol list.

Currently many existing users using profile-sample-accurate want to reduce
code size as much as possible. Their use cases are different from the scenario
profile symbol list tries to handle -- the major motivation of adding profile
symbol list is to get the major memory/code size saving without introduce
performance regression. So to keep the behavior of profile-sample-accurate
unchanged, we think decoupling these two things and using a new flag to
control the handling of profile symbol list may be better.

When profile-sample-accurate and the new flag profile-accurate-for-symsinlist
are both present, since profile-sample-accurate is a user assertion we let it
have a higher precedence.

Differential Revision: https://reviews.llvm.org/D68047

llvm-svn: 373133
2019-09-27 22:33:59 +00:00
Roman Lebedev 269f1bea0d [InstCombine] Simplify shift-by-sext to shift-by-zext
Summary:
This is valid for any `sext` bitwidth pair:
```
Processing /tmp/opt.ll..

----------------------------------------
  %signed = sext %y
  %r = shl %x, %signed
  ret %r
=>
  %unsigned = zext %y
  %r = shl %x, %unsigned
  ret %r
  %signed = sext %y

Done: 2016
Optimization is correct!
```

(This isn't so for funnel shifts, there it's illegal for e.g. i6->i7.)

Main motivation is the C++ semantics:
```
int shl(int a, char b) {
    return a << b;
}
```
ends as
```
  %3 = sext i8 %1 to i32
  %4 = shl i32 %0, %3
```
https://godbolt.org/z/0jgqUq
which is, as this shows, too pessimistic.

There is another problem here - we can only do the fold
if sext is one-use. But we can trivially have cases
where several shifts have the same sext shift amount.
This should be resolved, later.

Reviewers: spatel, nikic, RKSimon

Reviewed By: spatel

Subscribers: efriedma, hiraditya, nlopes, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68103

llvm-svn: 373106
2019-09-27 18:12:15 +00:00
Simon Pilgrim 2e0de86808 ModuleUtils - silence static analyzer dyn_cast<> null dereference warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<> directly and if not assert will fire for us.

llvm-svn: 373099
2019-09-27 16:55:49 +00:00
Simon Pilgrim f71f23d14d FunctionImportGlobalProcessing::processGlobalForThinLTO - silence static analyzer dyn_cast<FunctionSummary> null dereference warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<FunctionSummary> directly and if not assert will fire for us.

llvm-svn: 373097
2019-09-27 15:49:19 +00:00
Simon Pilgrim 623b0e6963 SCCP - silence static analyzer dyn_cast<StructType> null dereference warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<StructType> directly and if not assert will fire for us.

llvm-svn: 373095
2019-09-27 15:49:10 +00:00
Guillaume Chatelet 18f805a7ea [Alignment][NFC] Remove unneeded llvm:: scoping on Align types
llvm-svn: 373081
2019-09-27 12:54:21 +00:00
Guillaume Chatelet d886f391af [Alignment][NFC] MaybeAlign in GVNExpression
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67922

llvm-svn: 373054
2019-09-27 08:56:43 +00:00
Peter Collingbourne c336557f02 hwasan: Compatibility fixes for short granules.
We can't use short granules with stack instrumentation when targeting older
API levels because the rest of the system won't understand the short granule
tags stored in shadow memory.

Moreover, we need to be able to let old binaries (which won't understand
short granule tags) run on a new system that supports short granule
tags. Such binaries will call the __hwasan_tag_mismatch function when their
outlined checks fail. We can compensate for the binary's lack of support
for short granules by implementing the short granule part of the check in
the __hwasan_tag_mismatch function. Unfortunately we can't do anything about
inline checks, but I don't believe that we can generate these by default on
aarch64, nor did we do so when the ABI was fixed.

A new function, __hwasan_tag_mismatch_v2, is introduced that lets code
targeting the new runtime avoid redoing the short granule check. Because tag
mismatches are rare this isn't important from a performance perspective; the
main benefit is that it introduces a symbol dependency that prevents binaries
targeting the new runtime from running on older (i.e. incompatible) runtimes.

Differential Revision: https://reviews.llvm.org/D68059

llvm-svn: 373035
2019-09-27 01:02:10 +00:00
Jordan Rupprecht f98d2c099a Revert [SLP] Fix for PR31847: Assertion failed: (isLoopInvariant(Operands[i], L) && "SCEVAddRecExpr operand is not loop-invariant!")
This reverts r372626 (git commit 6a278d9073)

llvm-svn: 373019
2019-09-26 22:09:17 +00:00
Kit Barton 50bc610460 [LoopFusion] Add ability to fuse guarded loops
Summary:
This patch extends the current capabilities in loop fusion to fuse guarded loops
(as defined in https://reviews.llvm.org/D63885). The patch adds the necessary
safety checks to ensure that it safe to fuse the guarded loops (control flow
equivalent, no intervening code, and same guard conditions). It also provides an
alternative method to perform the actual fusion of guarded loops. The mechanics
to fuse guarded loops are slightly different then fusing non-guarded loops, so I
opted to keep them separate methods. I will be cleaning this up in later
patches, and hope to converge on a single method to fuse both guarded and
non-guarded loops, but for now I think the review will be easier to keep them
separate.

Reviewers: jdoerfert, Meinersbur, dmgreen, etiotto, Whitney

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65464

llvm-svn: 373018
2019-09-26 21:42:45 +00:00
Zhaoshi Zheng 1128fa0924 [Unroll] Do NOT unroll a loop with small runtime upperbound
For a runtime loop if we can compute its trip count upperbound:

Don't unroll if:
1. loop is not guaranteed to run either zero or upperbound iterations; and
2. trip count upperbound is less than UnrollMaxUpperBound
Unless user or TTI asked to do so.

If unrolling, limit unroll factor to loop's trip count upperbound.

Differential Revision: https://reviews.llvm.org/D62989

Change-Id: I6083c46a9d98b2e22cd855e60523fdc5a4929c73
llvm-svn: 373017
2019-09-26 21:40:27 +00:00
Craig Topper 46721bb7f5 [InstCombine] Use m_Zero instead of isNullValue() when checking if a GEP index is all zeroes to prevent an infinite loop.
The test case here previously infinite looped. Only one element from the GEP is used so SimplifyDemandedVectorElts would replace the other lanes in each index with undef leading to the first index being <0, undef, undef, undef>. But there's a GEP transform that tries to replace an index into a 0 sized type with a zero index. But the zero index check only works on ConstantInt 0 or ConstantAggregateZero so it would turn the index back to zeroinitializer. Resulting in a loop.

The fix is to use m_Zero() to allow a vector of zeroes and undefs.

Differential Revision: https://reviews.llvm.org/D67977

llvm-svn: 373000
2019-09-26 17:20:50 +00:00
Jakub Kuderski d98cb81cd1 Handle successor's PHI node correctly when flattening CFG merges two if-regions
Summary:
FlattenCFG merges two 'if' basicblocks by inserting one basicblock
to another basicblock. The inserted basicblock can have a successor
that contains a PHI node whoes incoming basicblock is the inserted
basicblock. Since the existing code does not handle it, it becomes
a badref.

if (cond1)
  statement
if (cond2)
  statement
successor - contains PHI node whose predecessor is cond2

-->
if (cond1 || cond2)
  statement
(BB for cond2 was deleted)
successor - contains PHI node whose predecessor is cond2 --> bad ref!

Author: Jaebaek Seo

Reviewers: asbirlea, kuhar, tstellar, chandlerc, davide, dexonsmith

Reviewed By: kuhar

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68032

llvm-svn: 372989
2019-09-26 15:20:17 +00:00
Simon Pilgrim c15cd009ac [FlattenCFG] Silence static analyzer dyn_cast<BranchInst> null dereference warnings. NFCI.
The static analyzer is warning about a potential null dereferences, but we should be able to use cast<BranchInst> directly and if not assert will fire for us.

llvm-svn: 372977
2019-09-26 13:33:15 +00:00
Bjorn Pettersson 163c54d288 [InstCombine] Don't assume CmpInst has been visited in getFlippedStrictnessPredicateAndConstant
Summary:
Removing an assumption (assert) that the CmpInst already has been
simplified in getFlippedStrictnessPredicateAndConstant. Solution is
to simply bail out instead of hitting the assertion. Instead we
assume that any profitable rewrite will happen in the next iteration
of InstCombine.

The reason why we can't assume that the CmpInst already has been
simplified is that the worklist does not guarantee such an ordering.

Solves https://bugs.llvm.org/show_bug.cgi?id=43376

Reviewers: spatel, lebedev.ri

Reviewed By: lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68022

llvm-svn: 372972
2019-09-26 12:16:01 +00:00
Simon Pilgrim 6b794dfd3d MemorySanitizer - silence static analyzer dyn_cast<> null dereference warnings. NFCI.
The static analyzer is warning about a potential null dereferences, but we should be able to use cast<> directly and if not assert will fire for us.

llvm-svn: 372960
2019-09-26 10:56:14 +00:00
Simon Pilgrim faa5b39e4e PGOMemOPSizeOpt - silence static analyzer dyn_cast<MemIntrinsic> null dereference warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<MemIntrinsic> directly and if not assert will fire for us.

llvm-svn: 372959
2019-09-26 10:56:07 +00:00
Roman Lebedev a2fa03af3a [InstCombine] foldUnsignedUnderflowCheck(): one last pattern with 'sub' (PR43251)
https://rise4fun.com/Alive/0j9

llvm-svn: 372930
2019-09-25 22:59:59 +00:00
Eli Friedman 69dddfe268 [LICM] Don't verify domtree/loopinfo unless EXPENSIVE_CHECKS is enabled.
For large functions, verifying the whole function after each loop takes
non-linear time.

Differential Revision: https://reviews.llvm.org/D67571

llvm-svn: 372924
2019-09-25 22:35:47 +00:00
Roman Lebedev 23646952e2 [InstCombine] Fold (A - B) u>=/u< A --> B u>/u<= A iff B != 0
https://rise4fun.com/Alive/KtL

This also shows that the fold added in D67412 / r372257
was too specific, and the new fold allows those test cases
to be handled more generically, therefore i delete now-dead code.

This is yet again motivated by
D67122 "[UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour"

llvm-svn: 372912
2019-09-25 19:06:40 +00:00
Florian Hahn f3ab99dcf8 [InstCombine] Limit FMul constant folding for fma simplifications.
As @reames pointed out post-commit, rL371518 adds additional rounding
in some cases, when doing constant folding of the multiplication.
This breaks a guarantee llvm.fma makes and must be avoided.

This patch reapplies rL371518, but splits off the simplifications not
requiring rounding from SimplifFMulInst as SimplifyFMAFMul.

Reviewers: spatel, lebedev.ri, reames, scanon

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D67434

llvm-svn: 372899
2019-09-25 17:03:20 +00:00
Florian Hahn 5c3bc3c930 [PatternMatch] Make m_Br more flexible, add matchers for BB values.
Currently m_Br only takes references to BasicBlock*, which limits its
flexibility. For example, you have to declare a variable, even if you
ignore the result or you have to have additional checks to make sure the
matched BB matches an expected one.

This patch adds m_BasicBlock and m_SpecificBB matchers, which can be
used like the existing matchers for constants or values.

I also had a look at the existing uses and updated a few. IMO it makes
the code a bit more explicit.

Reviewers: spatel, craig.topper, RKSimon, majnemer, lebedev.ri

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D68013

llvm-svn: 372885
2019-09-25 15:05:08 +00:00
Philip Reames d9629b88ff [GCRelocate] Add a peephole to canonicalize base pointer relocation
If we generate the gc.relocate, and then later prove two arguments to the statepoint are equivalent, we should canonicalize the gc.relocate to the form we would have produced if this had been known before rewriting.

llvm-svn: 372771
2019-09-24 17:24:16 +00:00
Roman Lebedev 45fd1e9d50 [InstCombine] (a+b) < a && (a+b) != 0 -> (0-b) < a iff a/b != 0 (PR43259)
Summary:
This is again motivated by D67122 sanitizer check enhancement.
That patch seemingly worsens `-fsanitize=pointer-overflow`
overhead from 25% to 50%, which strongly implies missing folds.

For
```
#include <cassert>
char* test(char& base, signed long offset) {
  __builtin_assume(offset < 0);
  return &base + offset;
}
```
We produce

https://godbolt.org/z/r40U47

and again those two icmp's can be merged:
```
Name: 0
Pre: C != 0
  %adjusted = add i8 %base, C
  %not_null = icmp ne i8 %adjusted, 0
  %no_underflow = icmp ult i8 %adjusted, %base
  %r = and i1 %not_null, %no_underflow
=>
  %neg_offset = sub i8 0, C
  %r = icmp ugt i8 %base, %neg_offset
```
https://rise4fun.com/Alive/ALap
https://rise4fun.com/Alive/slnN

There are 3 other variants of this pattern,
i believe they all will go into InstSimplify.

https://bugs.llvm.org/show_bug.cgi?id=43259

Reviewers: spatel, xbolva00, nikic

Reviewed By: spatel

Subscribers: efriedma, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67849

llvm-svn: 372768
2019-09-24 16:10:50 +00:00
Roman Lebedev 5b881f356c [InstCombine] (a+b) <= a && (a+b) != 0 -> (0-b) < a (PR43259)
Summary:
This is again motivated by D67122 sanitizer check enhancement.
That patch seemingly worsens `-fsanitize=pointer-overflow`
overhead from 25% to 50%, which strongly implies missing folds.

This pattern isn't exactly what we get there
(strict vs. non-strict predicate), but this pattern does not
require known-bits analysis, so it is best to handle it first.

```
Name: 0
  %adjusted = add i8 %base, %offset
  %not_null = icmp ne i8 %adjusted, 0
  %no_underflow = icmp ule i8 %adjusted, %base
  %r = and i1 %not_null, %no_underflow
=>
  %neg_offset = sub i8 0, %offset
  %r = icmp ugt i8 %base, %neg_offset
```
https://rise4fun.com/Alive/knp

There are 3 other variants of this pattern,
they all will go into InstSimplify:
https://rise4fun.com/Alive/bIDZ

https://bugs.llvm.org/show_bug.cgi?id=43259

Reviewers: spatel, xbolva00, nikic

Reviewed By: spatel

Subscribers: hiraditya, majnemer, vsk, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67846

llvm-svn: 372767
2019-09-24 16:10:38 +00:00
Simon Pilgrim 934f18144d LoopVectorize - silence static analyzer dyn_cast<CmpInst> null dereference warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<CmpInst> directly and if not assert will fire for us.

llvm-svn: 372732
2019-09-24 11:27:38 +00:00
Simon Pilgrim b6d11def37 [SimplifyCFG] FoldTwoEntryPHINode - silence static analyzer null dereference warning. NFCI.
Assert that we've found the DomBlock.

llvm-svn: 372728
2019-09-24 11:17:20 +00:00
Simon Pilgrim 9e8076b219 SimplifyCFG - silence static analyzer dyn_cast<LandingPadInst> null dereference warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<LandingPadInst> directly and if not assert will fire for us.

llvm-svn: 372727
2019-09-24 11:17:13 +00:00
Simon Pilgrim bc58230e29 SimplifyCFG - silence static analyzer dyn_cast<Instruction> null dereference warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<Instruction> directly and if not assert will fire for us.

llvm-svn: 372726
2019-09-24 11:17:06 +00:00
Alexey Lapshin 49f3c2b604 [Debuginfo] dbg.value points to undef value after Induction Variable Simplification.
Induction Variable Simplification pass does not update dbg.value intrinsic.

Before:

%add = add nuw nsw i32 %ArgIndex.06, 1
call void @llvm.dbg.value(metadata i32 %add, metadata !17, metadata !DIExpression())

After:

%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
call void @llvm.dbg.value(metadata i64 undef, metadata !17, metadata !DIExpression())

There should be:

%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
call void @llvm.dbg.value(metadata i64 %indvars.iv.next, metadata !17, metadata !DIExpression())

Differential Revision: https://reviews.llvm.org/D67770

llvm-svn: 372703
2019-09-24 08:47:03 +00:00
Sjoerd Meijer 0fcb3afb40 [LV] Forced vectorization with runtime checks and OptForSize
When vectorisation is forced with a pragma, we optimise for min size, and we
need to emit runtime memory checks, then allow this code growth and don't run
in an assert like we currently do.

This is the result of D65197 and D66803, and was a use-case not really
considered before. If this now happens, we emit an optimisation remark warning
about the code-size expansion, which can be avoided by not forcing
vectorisation or possibly source-code modifications.

Differential Revision: https://reviews.llvm.org/D67764

llvm-svn: 372694
2019-09-24 08:03:34 +00:00
Huihui Zhang a4dd98f2e9 [InstCombine] Fold a shifty implementation of clamp-to-allones.
Summary:
Fold
or(ashr(subNSW(Y, X), ScalarSizeInBits(Y)-1), X)
into
X s> Y ? -1 : X

https://rise4fun.com/Alive/d8Ab

clamp255 is a common operator in image processing, can be implemented
in a shifty way "(255 - X) >> 31 | X & 255". Fold shift into select
enables more optimization, e.g., vmin generation for ARM target.

Reviewers: lebedev.ri, efriedma, spatel, kparzysz, bcahoon

Reviewed By: lebedev.ri

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67800

llvm-svn: 372678
2019-09-24 00:30:09 +00:00
Huihui Zhang 8952199715 [InstCombine] Fold a shifty implementation of clamp-to-zero.
Summary:
Fold
and(ashr(subNSW(Y, X), ScalarSizeInBits(Y)-1), X)
into
X s> Y ? X : 0

https://rise4fun.com/Alive/lFH

Fold shift into select enables more optimization,
e.g., vmax generation for ARM target.

Reviewers: lebedev.ri, efriedma, spatel, kparzysz, bcahoon

Reviewed By: lebedev.ri

Subscribers: xbolva00, andreadb, craig.topper, RKSimon, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67799

llvm-svn: 372676
2019-09-24 00:15:03 +00:00
Saleem Abdulrasool 082f895b1a HotColdSplitting: invalidate the AssumptionCache on split
When a cold path is outlined, the value tracking in the assumption cache may be
invalidated due to the code motion.  We would previously trip an assertion in
subsequent passes (but required the passes to happen in a single run as the
assumption cache is shared across the passes).  Invalidating the cache ensures
that we get the correct information when needed with the legacy pass manager as
well.

llvm-svn: 372667
2019-09-23 22:23:01 +00:00
Wei Mi 22fd88530b [SampleFDO] Treat names in profile as not cold only when profile symbol list
is available

In rL372232, we treated names showing up in profile as not cold when
profile-sample-accurate is enabled. This caused 70k size regression in
Chrome/Android. The patch put a guard and only enable the change when
profile symbol list is available, i.e., keep the old behavior when profile
symbol list is not available.

Differential Revision: https://reviews.llvm.org/D67931

llvm-svn: 372665
2019-09-23 22:11:35 +00:00
Roman Lebedev 23aac95a32 [InstCombine] foldOrOfICmps(): Acquire SimplifyQuery with set CxtI
Extracted from https://reviews.llvm.org/D67849#inline-610377

llvm-svn: 372654
2019-09-23 20:40:47 +00:00
Roman Lebedev 595cfda059 [InstCombine] foldAndOfICmps(): Acquire SimplifyQuery with set CxtI
Extracted from https://reviews.llvm.org/D67849#inline-610377

llvm-svn: 372653
2019-09-23 20:40:40 +00:00
David Bolvansky 48db0272d6 [InstCombine] Annotate strndup calls with dereferenceable_or_null
"Implementations are free to malloc() a buffer containing either (size + 1) bytes or (strnlen(s, size) + 1) bytes. Applications should not assume that strndup() will allocate (size + 1) bytes when strlen(s) is smaller than size."

llvm-svn: 372647
2019-09-23 19:55:45 +00:00
Roman Lebedev 47e1ce4abe [IR] Add getExtendedType() to IntegerType and Type (dispatching to IntegerType or VectorType)
llvm-svn: 372638
2019-09-23 18:21:33 +00:00
Roman Lebedev 1972327d63 [InstCombine] dropRedundantMaskingOfLeftShiftInput(): improve comment
llvm-svn: 372637
2019-09-23 18:21:14 +00:00
David Bolvansky 8d52016155 [SLC] Convert some strndup calls to strdup calls
Summary:
Motivation:
- If we can fold it to strdup, we should (strndup does more things than strdup).
- Annotation mechanism. (Works for strdup well).

strdup and strndup are part of C 20 (currently posix fns), so we should optimize them.

Reviewers: efriedma, jdoerfert

Reviewed By: jdoerfert

Subscribers: lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67679

llvm-svn: 372636
2019-09-23 18:20:01 +00:00
Roman Lebedev 0a51e1f66d [InstCombine] dropRedundantMaskingOfLeftShiftInput(): pat. c/d/e with mask (PR42563)
Summary:
If we have a pattern `(x & (-1 >> maskNbits)) << shiftNbits`,
we already know (have a fold) that will drop the `& (-1 >> maskNbits)`
mask iff `(shiftNbits-maskNbits) s>= 0` (i.e. `shiftNbits u>= maskNbits`).

So even if `(shiftNbits-maskNbits) s< 0`, we can still
fold, we will just need to apply a **constant** mask afterwards:
```
Name: c, normal+mask
  %t0 = lshr i32 -1, C1
  %t1 = and i32 %t0, %x
  %r = shl i32 %t1, C2
=>
  %n0 = shl i32 %x, C2
  %n1 = i32 ((-(C2-C1))+32)
  %n2 = zext i32 %n1 to i64
  %n3 = lshr i64 -1, %n2
  %n4 = trunc i64 %n3 to i32
  %r = and i32 %n0, %n4
```
https://rise4fun.com/Alive/gslRa

Naturally, old `%masked` will have to be one-use.
This is not valid for pattern f - where "masking" is done via `ashr`.

https://bugs.llvm.org/show_bug.cgi?id=42563

Reviewers: spatel, nikic, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67725

llvm-svn: 372630
2019-09-23 17:04:28 +00:00
Roman Lebedev b4a1d8a84c [InstCombine] dropRedundantMaskingOfLeftShiftInput(): pat. a/b with mask (PR42563)
Summary:
And this is **finally** the interesting part of that fold!

If we have a pattern `(x & (~(-1 << maskNbits))) << shiftNbits`,
we already know (have a fold) that will drop the `& (~(-1 << maskNbits))`
mask iff `(maskNbits+shiftNbits) u>= bitwidth(x)`.
But that is actually ignorant, there's more general fold here:

In this pattern, `(maskNbits+shiftNbits)` actually correlates
with the number of low bits that will remain in the final value.
So even if `(maskNbits+shiftNbits) u< bitwidth(x)`, we can still
fold, we will just need to apply a **constant** mask afterwards:
```
Name: a, normal+mask
  %onebit = shl i32 -1, C1
  %mask = xor i32 %onebit, -1
  %masked = and i32 %mask, %x
  %r = shl i32 %masked, C2
=>
  %n0 = shl i32 %x, C2
  %n1 = add i32 C1, C2
  %n2 = zext i32 %n1 to i64
  %n3 = shl i64 -1, %n2
  %n4 = xor i64 %n3, -1
  %n5 = trunc i64 %n4 to i32
  %r = and i32 %n0, %n5
```
https://rise4fun.com/Alive/F5R

Naturally, old `%masked` will have to be one-use.
Similar fold exists for patterns c,d,e, will post patch later.

https://bugs.llvm.org/show_bug.cgi?id=42563

Reviewers: spatel, nikic, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67677

llvm-svn: 372629
2019-09-23 17:04:14 +00:00
Alexey Bataev 6a278d9073 [SLP] Fix for PR31847: Assertion failed: (isLoopInvariant(Operands[i], L) && "SCEVAddRecExpr operand is not loop-invariant!")
Summary:
Initially SLP vectorizer replaced all going-to-be-vectorized
instructions with Undef values. It may break ScalarEvaluation and may
cause a crash.
Reworked SLP vectorizer so that it does not replace vectorized
instructions by UndefValue anymore. Instead vectorized instructions are
marked for deletion inside if BoUpSLP class and deleted upon class
destruction.

Reviewers: mzolotukhin, mkuper, hfinkel, RKSimon, davide, spatel

Subscribers: RKSimon, Gerolf, anemet, hans, majnemer, llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D29641

llvm-svn: 372626
2019-09-23 16:25:03 +00:00
Roman Lebedev 01ac23ca62 [InstCombine] foldUnsignedUnderflowCheck(): s/Subtracted/ZeroCmpOp/
llvm-svn: 372625
2019-09-23 16:04:32 +00:00