22145 Commits

Author SHA1 Message Date
Serguei Katkov de67affd00 [Loop Peeling] Introduce an option for profile based peeling disabling.
This patch adds the ability to disable profile-based peeling, which
peels all iterations and, as a result, prohibits
further unroll/peeling attempts on that loop.

The motivation is to be able to separate peeling usage in the pipeline:
in the first part we peel only separate iterations if needed,
and later in the pipeline we apply full peeling, which prohibits further peeling.

Reviewers: reames, fhahn
Reviewed By: reames
Subscribers: hiraditya, zzheng, dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D64983

llvm-svn: 367668
2019-08-02 09:32:52 +00:00
Serguei Katkov bbdcc82111 [Loop Peeling] Do not close further unroll/peel if profile based peeling was not used.
The current peeling cost model can decide to peel off not all iterations
but only some of them, to eliminate conditions on phi. At the same time,
if any peeling happens, the door for further unroll/peel optimizations on that
loop closes, because part of the code assumes that if peeling happened
it was profile-based peeling and all iterations were peeled off.

To resolve this inconsistency, the patch provides a flag which states whether
full peeling based on the profile is enabled, and the peeling cost model
is able to modify this field just as it does PeelCount.

In a separate patch I will introduce an option to allow/disallow peeling based
on the profile.

To avoid infinite loop peeling, the patch tracks the total number of peeled iterations
through llvm.loop.peeled.count loop metadata.

Reviewers: reames, fhahn
Reviewed By: reames
Subscribers: hiraditya, zzheng, dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D64972

llvm-svn: 367647
2019-08-02 04:29:23 +00:00
Stanislav Mekhanoshin 6fe00a21f2 Handle casts changing pointer size in the vectorizer
Added code to truncate or shrink offsets so that we can continue
base pointer search if size has changed along the way.

Differential Revision: https://reviews.llvm.org/D65612

llvm-svn: 367646
2019-08-02 04:03:37 +00:00
Stanislav Mekhanoshin eee9312a85 Relax load store vectorizer pointer strip checks
The previous change to fix crash in the vectorizer introduced
performance regressions. The condition to preserve pointer
address space during the search is too tight, we only need to
match the size.

Differential Revision: https://reviews.llvm.org/D65600

llvm-svn: 367624
2019-08-01 22:18:56 +00:00
Sjoerd Meijer e0dfce0723 Follow up of rL367592, fix the build
Some buildbots complained about:
error: default label in switch which covers all enumeration values

llvm-svn: 367603
2019-08-01 18:54:29 +00:00
Alina Sbirlea 3af2a69575 [SimplifyCFG] Mark missed Changed to true.
Summary:
DominatorTree is invalid after SimplifyCFG because of a missed `Changed = true` when simplifying a branch condition and removing an edge.
Resolves PR42272.

Reviewers: zhizhouy, manojgupta

Subscribers: jlebar, sanjoy.google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65490

llvm-svn: 367596
2019-08-01 18:37:34 +00:00
Alina Sbirlea 172838df6b [MemorySSA] Set LoopSimplify to preserve MemorySSA in the NPM, if analysis exists.
Summary:
LoopSimplify is preserved in the legacy pass manager, but not in the new pass manager.
Update LoopSimplify to preserve MemorySSA conditionally when the analysis is available (same behavior as the legacy pass manager).

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65418

llvm-svn: 367594
2019-08-01 18:28:28 +00:00
Sjoerd Meijer 20b198ec5e [LV] Tail-Loop Folding
This allows folding of the scalar epilogue loop (the tail) into the main
vectorised loop body when the loop is annotated with a "vector predicate"
metadata hint. To fold the tail, instructions need to be predicated (masked),
enabling/disabling lanes for the remainder iterations.

Differential Revision: https://reviews.llvm.org/D65197

llvm-svn: 367592
2019-08-01 18:21:44 +00:00
Johannes Doerfert da4d811707 [Attributor][FIX] Indicate a missing update change
Users of AAReturnedValues need to know if HasOverdefinedReturnedCalls
changed from false to true as it will impact the result of the return
value traversal (calls are not ignored anymore).

This will be tested with the tests in D59978.

llvm-svn: 367581
2019-08-01 16:21:54 +00:00
Roman Lebedev 081e990d08 [IR] Value: add replaceUsesWithIf() utility
Summary:
While there is always a `Value::replaceAllUsesWith()`,
sometimes the replacement needs to be conditional.

I have only cleaned a few cases where `replaceUsesWithIf()`
could be used, to both add test coverage,
and show that it is actually useful.
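
To illustrate the difference between unconditional and conditional replacement, here is a minimal Python model. It is not the LLVM API: the `Value` and `Use` classes and their fields are hypothetical stand-ins, and the sketch only redirects those uses for which a predicate returns true.

```python
class Value:
    """Hypothetical stand-in for an IR value that tracks its uses."""
    def __init__(self, name):
        self.name = name
        self.uses = []  # Use objects that currently point at this value


class Use:
    """One operand slot of a user, pointing at a value."""
    def __init__(self, user, value):
        self.user = user
        self.value = value
        value.uses.append(self)


def replace_uses_with_if(old, new, should_replace):
    """Redirect each use of `old` to `new` when should_replace(use) is true."""
    for use in list(old.uses):  # copy: the use list is mutated while iterating
        if should_replace(use):
            old.uses.remove(use)
            use.value = new
            new.uses.append(use)


old, new = Value("old"), Value("new")
u1 = Use("user_in_bb1", old)
u2 = Use("user_in_bb2", old)
u3 = Use("other_in_bb1", old)

# Only rewrite uses whose (hypothetical) user lives in bb1.
replace_uses_with_if(old, new, lambda use: use.user.endswith("bb1"))
```

An unconditional replacement is then just the special case where the predicate always returns true.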

Reviewers: jdoerfert, spatel, RKSimon, craig.topper

Reviewed By: jdoerfert

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, aheejin, george.burgess.iv, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65528

llvm-svn: 367548
2019-08-01 12:32:08 +00:00
Roman Lebedev 0efeaa8162 [IR] SelectInst: add swapValues() utility
Summary:
Sometimes we need to swap true-val and false-val of a `SelectInst`.
Having a function for that is nicer than hand-writing it each time.

Reviewers: spatel, RKSimon, craig.topper, jdoerfert

Reviewed By: jdoerfert

Subscribers: jdoerfert, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65520

llvm-svn: 367547
2019-08-01 12:31:35 +00:00
Philip Reames 79c27c9464 Fix a release-only build warning triggered by rL367485
llvm-svn: 367499
2019-08-01 01:16:08 +00:00
Philip Reames f8e7b53657 [IndVars, RLEV] Support rewriting exit values in loops without known exits (prep work)
This is a preparatory patch for future work on supporting exit value rewriting in loops with a mixture of computable and non-computable exit counts.  The intention is to be "mostly NFC" - i.e. not enable any interesting new transforms - but in practice, there are some small output changes.

The test differences are caused by cases where getSCEVAtScope can simplify a single entry phi without needing any knowledge of the loop.

llvm-svn: 367485
2019-07-31 21:15:21 +00:00
Sanjay Patel 435cdecdf7 [InstCombine] canonicalize fneg before fmul/fdiv
Reverse the canonicalization of fneg relative to fmul/fdiv. That makes it
easier to implement the transforms (and possibly other fneg transforms) in
1 place because we can always start the pattern match from fneg (either the
legacy binop or the new unop).

There's a secondary practical benefit seen in PR21914 and PR42681:
https://bugs.llvm.org/show_bug.cgi?id=21914
https://bugs.llvm.org/show_bug.cgi?id=42681
...hoisting fneg rather than sinking seems to play nicer with LICM in IR
(although this change may expose analysis holes in the other direction).

1. The instcombine test changes show the expected neutral IR diffs from
   reversing the order.

2. The reassociation tests show that we were missing an optimization
   opportunity to fold away fneg-of-fneg. My reading of IEEE-754 says
   that all of these transforms are allowed (regardless of binop/unop
   fneg version) because:

   "For all other operations [besides copy/abs/negate/copysign], this
   standard does not specify the sign bit of a NaN result."
   In all of these transforms, we always have some other binop
   (fadd/fsub/fmul/fdiv), so we are free to flip the sign bit of a
   potential intermediate NaN operand.
   (If that interpretation is wrong, then we must already have a bug in
   the existing transforms?)

3. The clang tests shouldn't exist as-is, but that's effectively a
   revert of rL367149 (the test broke with an extension of the
   pre-existing fneg canonicalization in rL367146).

Differential Revision: https://reviews.llvm.org/D65399

llvm-svn: 367447
2019-07-31 16:53:22 +00:00
Stanislav Mekhanoshin ba1e845c21 [AMDGPU] Fix for vectorizer crash with pointers of different size
When vectorizer strips pointers it can eventually end up with
pointers of two different sizes, then SCEV will crash.

Differential Revision: https://reviews.llvm.org/D65480

llvm-svn: 367443
2019-07-31 16:33:11 +00:00
Florian Hahn fa42f42858 [IPSCCP] Move callsite check to the beginning of the loop.
We have some code that marks instructions with struct operands as overdefined,
but if the instruction is a call to a function with tracked arguments,
this breaks the assumption that the lattice values of all call sites
are not overdefined and will be replaced by a constant.

This also re-adds the assertion from D65222, with additionally skipping
non-callsite uses. This patch should address the cases reported in which
the assertion fired.

Fixes PR42738.

Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D65439

llvm-svn: 367430
2019-07-31 12:57:04 +00:00
Roman Lebedev 5e4e6b1fb1 [DivRemPairs] Fixup NDEBUG build - variable is only used in assertion
llvm-svn: 367423
2019-07-31 12:26:37 +00:00
Roman Lebedev a686c60c45 [DivRemPairs] Recommit: Handling for expanded-form rem - recomposition (PR42673)
Summary:
While `-div-rem-pairs` pass can decompose rem in div+rem pair when div-rem pair
is unsupported by target, nothing performs the opposite fold.
We can't do that in InstCombine or DAGCombine since neither of those has access to TTI.
So it makes most sense to teach `-div-rem-pairs` about it.

If we matched rem in expanded form, we know we will be able to place div-rem pair
next to each other so we won't regress the situation.
Also, we shouldn't decompose rem if we matched already-decomposed form.
This is surprisingly straight-forward otherwise.

The original patch was committed in rL367288 but was reverted in rL367289
because it exposed pre-existing RAUW issues in internal data structures
of the pass; those now have been addressed in a previous patch.

https://bugs.llvm.org/show_bug.cgi?id=42673

Reviewers: spatel, RKSimon, efriedma, ZaMaZaN4iK, bogner

Reviewed By: bogner

Subscribers: bogner, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65298

llvm-svn: 367419
2019-07-31 12:06:51 +00:00
Roman Lebedev 5f616901f5 [DivRemPairs] Avoid RAUW pitfalls (PR42823)
Summary:
`DivRemPairs` internally creates two maps:
* {sign, dividend, divisor} -> div instruction
* {sign, dividend, divisor} -> rem instruction
Then it iterates over rem map, and looks if there is an entry
in div map with the same key. Then depending on some internal logic
it may RAUW rem instruction with something else.

But if that rem instruction is an input to other div/rem,
then it was used as a key in these maps, so the old value (used in key)
is now dangling, because RAUW didn't update those maps.
And we can't even RAUW map keys in general, there's `ValueMap`,
but we don't have a single `Value` as key...

The bug was discovered via D65298, and the test there exists.
Now, I'm not sure how to expose this issue in trunk.
The bug is clearly there if I change the map keys to be `AssertingVH`/`PoisoningVH`,
but I guess this didn't miscompile anything thus far?
I really don't think this is benign without that patch.

The fix is actually rather straight-forward - instead of trying to somehow
shoe-horn `ValueMap` here (doesn't fit, key isn't just `Value`), or writing a new
`ValueMap` with key being a struct of `Value`s, we can just have an intermediate
data structure - a vector, each entry containing matching `Div, Rem` pair,
and pre-filling it before doing any modifications.
This way we won't need to query map after doing RAUW, so no bug is possible.
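
The pitfall and the fix can be sketched with a small Python model. This is an illustration under stated assumptions, not the pass's code: the `Inst` class, `key_of`, `build_maps`, and `replace_all_uses` are hypothetical names, and operands are compared by object identity to mimic pointer keys.

```python
class Inst:
    """Hypothetical IR instruction: a name, an opcode, and operand values."""
    def __init__(self, name, opcode, operands=()):
        self.name = name
        self.opcode = opcode
        self.operands = list(operands)


def key_of(inst):
    # {sign, dividend, divisor}; operands hash and compare by identity.
    return (inst.opcode.startswith("s"), inst.operands[0], inst.operands[1])


def build_maps(insts):
    div_map, rem_map = {}, {}
    for inst in insts:
        if inst.opcode in ("sdiv", "udiv"):
            div_map[key_of(inst)] = inst
        elif inst.opcode in ("srem", "urem"):
            rem_map[key_of(inst)] = inst
    return div_map, rem_map


def replace_all_uses(insts, old, new):
    for inst in insts:
        inst.operands = [new if op is old else op for op in inst.operands]


x, y = Inst("x", "arg"), Inst("y", "arg")
d1, r1 = Inst("d1", "sdiv", [x, y]), Inst("r1", "srem", [x, y])
d2, r2 = Inst("d2", "sdiv", [r1, y]), Inst("r2", "srem", [r1, y])
insts = [d1, r1, d2, r2]

div_map, rem_map = build_maps(insts)

# The fix: pre-fill the matching (div, rem) pairs before modifying anything.
pairs = [(div_map[k], rem) for k, rem in rem_map.items() if k in div_map]

# Rewrite r1 into something else. The map keys still mention the old r1,
# so a lookup with r2's current operands no longer finds d2.
replacement = Inst("r1.expanded", "sub", [x, y])
replace_all_uses(insts, r1, replacement)
stale = key_of(r2) not in div_map
```

Because `pairs` was filled in before the rewrite, later processing never needs to query the maps again, which is the property the commit message relies on.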

Reviewers: spatel, bogner, RKSimon, craig.topper

Reviewed By: spatel

Subscribers: hiraditya, hans, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65451

llvm-svn: 367417
2019-07-31 12:06:38 +00:00
Florian Hahn 189efe295b Recommit "[GVN] Preserve loop related analysis/canonical forms."
This fixes some pipeline tests.
This reverts commit d0b6f42936.

llvm-svn: 367401
2019-07-31 09:27:54 +00:00
Florian Hahn d0b6f42936 Revert [GVN] Preserve loop related analysis/canonical forms.
This reverts r367332 (git commit 2d7227ec3a)

llvm-svn: 367335
2019-07-30 17:04:58 +00:00
Florian Hahn 2d7227ec3a [GVN] Preserve loop related analysis/canonical forms.
LoopInfo can be easily preserved by passing it to the functions that
modify the CFG (SplitCriticalEdge and MergeBlockIntoPredecessor).
SplitCriticalEdge also preserves LoopSimplify and LCSSA form when passing in
LoopInfo. The test case shows that we preserve LoopSimplify and
LoopInfo. Adding addPreservedID(LCSSAID) did not preserve LCSSA for some
reason.

Also I am not sure if it is possible to preserve those in the new pass
manager, as they aren't analysis passes.

Reviewers: reames, hfinkel, davide, jdoerfert

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D65137

llvm-svn: 367332
2019-07-30 16:43:39 +00:00
Kit Barton de0b633999 [LoopFusion] Extend use of OptimizationRemarkEmitter
Summary:
This patch extends the use of the OptimizationRemarkEmitter to provide
information about loops that are not fused, and loops that are not eligible for
fusion. In particular, it uses the OptimizationRemarkAnalysis to identify loops
that are not eligible for fusion and the OptimizationRemarkMissed to identify
loops that cannot be fused.

It also reuses the statistics to provide the messages used in the
OptimizationRemarks. This provides common message strings between the
optimization remarks and the statistics.

I would like feedback on this approach, in general. If people are OK with this,
I will flesh out additional remarks in subsequent commits.

Subscribers: hiraditya, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63844

llvm-svn: 367327
2019-07-30 15:58:43 +00:00
Roman Lebedev be612ea471 [InstCombine] Fold "x ?% y ==/!= 0" to "x & (y-1) ==/!= 0" iff y is power-of-two
Summary:
I have stumbled into this by accident while preparing to extend backend `x s% C ==/!= 0` handling.

While we did happen to handle this fold in most of the cases,
the folding is indirect - we fold `x u% y` to `x & (y-1)` (iff `y` is power-of-two),
or first turn `x s% -y` to `x u% y`; that does handle most of the cases.
But we can't turn `x s% INT_MIN` to `x u% -INT_MIN`,
and thus we end up being stuck with `(x s% INT_MIN) == 0`.

There is no such restriction for the more general fold:
https://rise4fun.com/Alive/IIeS

To be noted, the fold does not enforce that `y` is a constant,
so it may indeed increase instruction count.
This is consistent with what `x u% y`->`x & (y-1)` already does.
I think it makes sense, it's at most one (simple) extra instruction,
while `rem`ainder is really much more un-simple (and likely **very** costly).
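
The equivalence can be checked exhaustively at a small bit width. The following Python sketch models 8-bit values (the width and helper names are assumptions chosen for illustration) and compares the remainder test against the mask test for every power-of-two bit pattern, including `0x80`, which is `INT_MIN` when read as a signed 8-bit value.

```python
BITS = 8
MASK = (1 << BITS) - 1


def to_signed(bits):
    """Interpret a BITS-wide bit pattern as a two's-complement integer."""
    return bits - (1 << BITS) if bits & (1 << (BITS - 1)) else bits


def srem(x, y):
    """Signed remainder that truncates toward zero, like LLVM's srem."""
    q = abs(x) // abs(y)
    if (x < 0) != (y < 0):
        q = -q
    return x - q * y


def fold_agrees(x_bits, y_bits):
    """Compare `x ?% y == 0` against `x & (y-1) == 0` for one input pair."""
    masked = (x_bits & ((y_bits - 1) & MASK)) == 0
    unsigned = (x_bits % y_bits) == 0
    signed = srem(to_signed(x_bits), to_signed(y_bits)) == 0
    return masked == unsigned and masked == signed


# Every single-bit pattern; the last one is 0x80, i.e. INT_MIN as a signed i8.
powers_of_two = [1 << k for k in range(BITS)]
all_agree = all(
    fold_agrees(x, y) for x in range(1 << BITS) for y in powers_of_two
)
```

This is only a brute-force sanity check at one width; the Alive link above is the actual proof referenced by the commit.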

Reviewers: spatel, RKSimon, nikic, xbolva00, craig.topper

Reviewed By: RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65046

llvm-svn: 367322
2019-07-30 15:28:22 +00:00
Roman Lebedev 8e0cf076ac Revert "[DivRemPairs] Handling for expanded-form rem - recomposition (PR42673)"
test-suite/MultiSource/Benchmarks/DOE-ProxyApps-C/miniGMG broke:

Only PHI nodes may reference their own value!
  %sub33 = srem i32 %sub33, %ranks_in_i

This reverts commit r367288.

llvm-svn: 367289
2019-07-30 07:44:58 +00:00
Roman Lebedev c75cdd056f [DivRemPairs] Handling for expanded-form rem - recomposition (PR42673)
Summary:
While `-div-rem-pairs` pass can decompose rem in div+rem pair when div-rem pair
is unsupported by target, nothing performs the opposite fold.
We can't do that in InstCombine or DAGCombine since neither of those has access to TTI.
So it makes most sense to teach `-div-rem-pairs` about it.

If we matched rem in expanded form, we know we will be able to place div-rem pair
next to each other so we won't regress the situation.
Also, we shouldn't decompose rem if we matched already-decomposed form.
This is surprisingly straight-forward otherwise.

https://bugs.llvm.org/show_bug.cgi?id=42673

Reviewers: spatel, RKSimon, efriedma, ZaMaZaN4iK, bogner

Reviewed By: bogner

Subscribers: bogner, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65298

llvm-svn: 367288
2019-07-30 07:10:00 +00:00
Peter Collingbourne dd9682196b ThinLTOBitcodeWriter: Include globals associated with type metadata globals in the merged module.
Globals that are associated with globals with type metadata need to appear
in the merged module because they will reference the global's section directly.

Differential Revision: https://reviews.llvm.org/D65312

llvm-svn: 367242
2019-07-29 17:22:40 +00:00
Sanjay Patel e9ee7b47d4 [InstCombine] fold fadd+fneg with fdiv/fmul between
The backend already does this via isNegatibleForFree(),
but we may want to alter the fneg IR canonicalizations
that currently exist, so we need to try harder to fold
fneg in IR to avoid regressions.

llvm-svn: 367227
2019-07-29 13:50:25 +00:00
Sanjay Patel 5483f4225e [InstCombine] reduce code for fadd with fneg operand; NFC
llvm-svn: 367224
2019-07-29 13:20:46 +00:00
Sanjay Patel 99c57c6daf [InstCombine] fold fsub+fneg with fdiv/fmul between
The backend already does this via isNegatibleForFree(),
but we may want to alter the fneg IR canonicalizations
that currently exist, so we need to try harder to fold
fneg in IR to avoid regressions.

llvm-svn: 367194
2019-07-28 17:10:06 +00:00
Hideto Ueno e7bea9b73a [Attributor] Deduce "align" attribute
Summary:
Deduce "align" attribute in attributor.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64152

llvm-svn: 367187
2019-07-28 07:04:01 +00:00
Sanjay Patel 02b9e45a7e [InstSimplify] remove quadratic time looping (PR42771)
The test case from:
https://bugs.llvm.org/show_bug.cgi?id=42771
...shows a ~30x slowdown caused by the awkward loop iteration (rL207302) that is
seemingly done just to avoid invalidating the instruction iterator. We can instead
delay instruction deletion until we reach the end of the block (or we could delay
until we reach the end of all blocks).
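
A minimal sketch of the deferred-deletion idea, assuming a block is modeled as a plain Python list and `is_dead` is a hypothetical predicate standing in for "this instruction was simplified and has no remaining uses":

```python
def simplify_block(block, is_dead):
    """Single forward pass; deletion is deferred to the end of the block."""
    dead = set()
    for index, inst in enumerate(block):  # the list is not mutated in this loop
        if is_dead(inst):
            dead.add(index)
    # Erase everything at once, after iteration has finished.
    block[:] = [inst for index, inst in enumerate(block) if index not in dead]
    return len(dead)


block = ["a", "dead1", "b", "dead2", "ret"]
removed = simplify_block(block, lambda inst: inst.startswith("dead"))
```

Each instruction is visited once, so the work is linear in the block size rather than restarting the scan after every deletion.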

There's a test diff here for a degenerate case with llvm.assume that is not
meaningful in itself, but serves to verify this change in logic.

This change probably doesn't result in much overall compile-time improvement
because we call '-instsimplify' as a standalone pass only once in the standard
-O2 opt pipeline currently.

Differential Revision: https://reviews.llvm.org/D65336

llvm-svn: 367173
2019-07-27 14:05:51 +00:00
Florian Hahn d89f6cb299 Revert [IPSCCP] Add assertion to surface cases where we zap returns with overdefined users.
This reverts r366998 (git commit 5354c83ece)

This breaks a linux kernel build and we have a reproducer to investigate.

llvm-svn: 367160
2019-07-26 22:14:08 +00:00
Wei Mi 55a68a2400 [JumpThreading] Stop searching predecessor when the current bb is in an
unreachable loop.

updatePredecessorProfileMetadata in jumpthreading tries to find the
first dominating predecessor block for a PHI value by searching up
the predecessor block chain.

But jumpthreading may see some temporary IR state which contains
unreachable bbs that have not been cleaned up. If an unreachable loop happens to
be on the predecessor block chain, continuing to chase the predecessor
block will run into an infinite loop.

The patch fixes it.
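
The failure mode can be modeled in a few lines of Python. This is a sketch of one way to guard such a walk (a visited set), not necessarily how the patch detects the unreachable loop; the function and parameter names are hypothetical.

```python
def find_first_matching_pred(start, single_pred, matches):
    """Walk up the single-predecessor chain, stopping if a block repeats."""
    visited = set()
    bb = start
    while bb is not None and bb not in visited:
        if matches(bb):
            return bb
        visited.add(bb)
        bb = single_pred.get(bb)  # None when there is no unique predecessor
    return None  # chain ended, or we walked into a cycle


# "b" and "c" form a loop that is unreachable from the entry block.
cyclic_chain = {"a": "b", "b": "c", "c": "b"}
in_cycle = find_first_matching_pred("a", cyclic_chain, lambda bb: bb == "entry")

normal_chain = {"a": "b", "b": "entry"}
found = find_first_matching_pred("a", normal_chain, lambda bb: bb == "entry")
```

Without the `visited` check, the first call would bounce between `b` and `c` forever.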

Differential Revision: https://reviews.llvm.org/D65310

llvm-svn: 367154
2019-07-26 20:59:22 +00:00
Sanjay Patel a9ab31558c [InstCombine] canonicalize negated operand of fdiv
This is a transform that we use with fmul, so use
it for fdiv too for consistency.

llvm-svn: 367146
2019-07-26 19:56:59 +00:00
Sanjay Patel c229cfeb7a [InstCombine] remove flop from lerp patterns
(Y * (1.0 - Z)) + (X * Z) -->
Y - (Y * Z) + (X * Z) -->
Y + Z * (X - Y)

This is part of solving:
https://bugs.llvm.org/show_bug.cgi?id=42716

Factoring eliminates an instruction, so that should be a good canonicalization.
The potential conversion to FMA would be handled by the backend based on target
capabilities.
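
The algebra behind the rewrite can be checked with exact rational arithmetic. The sketch below uses Python's `fractions.Fraction` so the identity is tested without rounding; the function names are illustrative only.

```python
from fractions import Fraction


def lerp_original(x, y, z):
    # (Y * (1.0 - Z)) + (X * Z): two multiplies, one subtract, one add
    return (y * (1 - z)) + (x * z)


def lerp_factored(x, y, z):
    # Y + Z * (X - Y): one multiply, one subtract, one add
    return y + z * (x - y)


samples = [Fraction(n, 7) for n in range(-5, 6)]
identity_holds = all(
    lerp_original(x, y, z) == lerp_factored(x, y, z)
    for x in samples
    for y in samples
    for z in samples
)
```

With floating-point values the two forms can round differently, so the identity is exact only in real (or rational) arithmetic.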

Differential Revision: https://reviews.llvm.org/D65305

llvm-svn: 367101
2019-07-26 11:19:18 +00:00
Serguei Katkov 7f8c809592 [Loop Utils] Extend the scope of addStringMetadataToLoop.
To avoid duplicates in loop metadata, if the string to add is
already there, just update the value.
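
The update-instead-of-duplicate behavior can be sketched as follows. Loop metadata is modeled here as a list of `(name, value)` pairs, which is an assumption made for illustration and not LLVM's representation.

```python
def add_string_metadata(loop_md, name, value):
    """Set `name` to `value`, updating in place if it is already present."""
    for index, (existing, _) in enumerate(loop_md):
        if existing == name:
            loop_md[index] = (name, value)
            return
    loop_md.append((name, value))


md = [("llvm.loop.unroll.disable", None)]
add_string_metadata(md, "llvm.loop.peeled.count", 1)
add_string_metadata(md, "llvm.loop.peeled.count", 3)  # updates, does not append
```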

Reviewers: reames, Ashutosh
Reviewed By: reames
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D65265

llvm-svn: 367087
2019-07-26 07:04:34 +00:00
Serguei Katkov 3c3a76527e [Loop Utils] Move utility addStringMetadataToLoop to LoopUtils.cpp. NFC.
Just move the utility function to LoopUtils.cpp to re-use it in loop peeling.

Reviewers: reames, Ashutosh
Reviewed By: reames
Subscribers: hiraditya, asbirlea, llvm-commits
Differential Revision: https://reviews.llvm.org/D65264

llvm-svn: 367085
2019-07-26 06:10:08 +00:00
Leonard Chan 007f674c6a Reland the "[NewPM] Port Sancov" patch from rL365838. No functional
changes were made to the patch since then.

--------

[NewPM] Port Sancov

This patch contains a port of SanitizerCoverage to the new pass manager. This one's a bit hefty.

Changes:

- Split SanitizerCoverageModule into 2: SanitizerCoverage for passing over
  functions and ModuleSanitizerCoverage for passing over modules.
- ModuleSanitizerCoverage exists for adding 2 module level calls to initialization
  functions but only if there's a function that was instrumented by sancov.
- Added legacy and new PM wrapper classes that own instances of the 2 new classes.
- Update llvm tests and add clang tests.

llvm-svn: 367053
2019-07-25 20:53:15 +00:00
Florian Hahn c74808b914 [PredicateInfo] Replace pointer comparisons with deterministic compares.
Currently there are a few pointer comparisons in ValueDFS_Compare, which
can cause non-deterministic ordering when materializing values. There
are 2 cases this patch fixes:

1. Order defs before uses used to compare pointers, which guarantees
   defs before uses, but causes non-deterministic ordering between 2
   uses or 2 defs, depending on the allocation order. By converting the
   pointers to booleans, we can circumvent that problem.

2. comparePHIRelated was comparing the basic block pointers of edges,
   which also results in a non-deterministic order and is also not
   really meaningful for ordering. By ordering by their destination DFS
   numbers we guarantee a deterministic order.
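
A simplified Python model of the idea: a sort key built only from numbers and booleans gives the same order regardless of where objects happen to be allocated. The `Entry` fields and the key below are hypothetical and much simpler than ValueDFS_Compare.

```python
import random


class Entry:
    """Hypothetical stand-in for one def or use to be ordered."""
    def __init__(self, label, dfs_in, is_def, local_num):
        self.label = label
        self.dfs_in = dfs_in
        self.is_def = is_def
        self.local_num = local_num


def deterministic_key(entry):
    # `not is_def` is False for defs, so defs sort before uses in the same
    # DFS slot. No addresses are involved anywhere in the key.
    return (entry.dfs_in, not entry.is_def, entry.local_num)


def make_entries():
    return [
        Entry("use.b", 2, False, 1),
        Entry("def.a", 2, True, 0),
        Entry("use.c", 1, False, 0),
        Entry("use.d", 2, False, 0),
    ]


def ordered_labels(entries):
    return [e.label for e in sorted(entries, key=deterministic_key)]


first = ordered_labels(make_entries())

shuffled = make_entries()  # fresh objects, so different addresses
random.shuffle(shuffled)   # and a different input order
second = ordered_labels(shuffled)
```

A key such as `id(entry)` would instead depend on allocation order, which is the kind of non-determinism the patch removes.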

For the example below, we can end up with 2 different uselist orderings,
when running `opt -mem2reg -ipsccp` hundreds of times. Because the
non-determinism is caused by allocation ordering, we cannot reproduce it
with ipsccp alone.

    declare i32 @hoge() local_unnamed_addr #0

    define dso_local i32 @ham(i8* %arg, i8* %arg1) #0 {
    bb:
      %tmp = alloca i32
      %tmp2 = alloca i32, align 4
      br label %bb19

    bb4:                                              ; preds = %bb20
      br label %bb6

    bb6:                                              ; preds = %bb4
      %tmp7 = call i32 @hoge()
      store i32 %tmp7, i32* %tmp
      %tmp8 = load i32, i32* %tmp
      %tmp9 = icmp eq i32 %tmp8, 912730082
      %tmp10 = load i32, i32* %tmp
      br i1 %tmp9, label %bb11, label %bb16

    bb11:                                             ; preds = %bb6
      unreachable

    bb13:                                             ; preds = %bb20
      br label %bb14

    bb14:                                             ; preds = %bb13
      %tmp15 = load i32, i32* %tmp
      br label %bb16

    bb16:                                             ; preds = %bb14, %bb6
      %tmp17 = phi i32 [ %tmp10, %bb6 ], [ 0, %bb14 ]
      br label %bb19

    bb18:                                             ; preds = %bb20
      unreachable

    bb19:                                             ; preds = %bb16, %bb
      br label %bb20

    bb20:                                             ; preds = %bb19
      indirectbr i8* null, [label %bb4, label %bb13, label %bb18]
    }

Reviewers: davide, efriedma

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D64866

llvm-svn: 367049
2019-07-25 20:48:13 +00:00
Serguei Katkov cde00c02e1 [Loop Peeling] Fix idom detection algorithm.
We'd like to determine the idom of exit block after peeling one iteration.
Let Exit be the exit block.
Let ExitingSet be the set of predecessors of the Exit block. They are exiting blocks.
Let Latch' and ExitingSet' be the copies after peeling.
We'd like to find idom'(Exit), the idom of Exit after peeling.
It is evident that idom'(Exit) will be the nearest common dominator of ExitingSet and ExitingSet'.
idom(Exit) is the nearest common dominator of ExitingSet.
idom(Exit)' is the nearest common dominator of ExitingSet'.
Taking into account that we have a single Latch, Latch' will dominate Header and idom(Exit).
So idom'(Exit) is the nearest common dominator of idom(Exit)' and Latch'.
All these basic blocks are in the same loop, so what we find is
(nearest common dominator of idom(Exit) and Latch)'.
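
The final formula can be sketched in Python over a dominator tree given as a child-to-idom mapping. The helper names and the two example loops are assumptions made for illustration; the result names the original block whose peeled copy becomes idom'(Exit).

```python
def nearest_common_dominator(idom, a, b):
    """Walk the dominator tree (child -> immediate dominator) to find the NCD."""
    ancestors = set()
    node = a
    while node is not None:
        ancestors.add(node)
        node = idom.get(node)  # the root has no entry, which ends the walk
    node = b
    while node not in ancestors:
        node = idom[node]
    return node


def idom_of_exit_after_peel(idom, exit_block, latch):
    """Return block B such that idom'(Exit) is B', the peeled copy of B."""
    return nearest_common_dominator(idom, idom[exit_block], latch)


# Loop 1: header -> {a, b}; a -> {latch, exit}; b -> latch; latch -> {header, exit}.
# Exit is reached from both a and latch, so idom(exit) is header.
idom_two_exits = {
    "header": "preheader",
    "a": "header",
    "b": "header",
    "latch": "header",
    "exit": "header",
}
result_two_exits = idom_of_exit_after_peel(idom_two_exits, "exit", "latch")

# Loop 2: the only exiting block is the latch, so idom(exit) is latch.
idom_latch_exit = {
    "header": "preheader",
    "body": "header",
    "latch": "body",
    "exit": "latch",
}
result_latch_exit = idom_of_exit_after_peel(idom_latch_exit, "exit", "latch")
```

In the first loop the answer is `header`, meaning the peeled copy of the header dominates Exit after peeling; in the second it is `latch`, meaning the peeled latch does.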

Reviewers: reames, fhahn
Reviewed By: reames
Subscribers: hiraditya, zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D65292

llvm-svn: 367044
2019-07-25 19:31:50 +00:00
Sanjay Patel b456310902 [SimplifyCFG] avoid crashing after simplifying a switch (PR42737)
Later code in TryToSimplifyUncondBranchFromEmptyBlock() assumes that
we have cleaned up unreachable blocks, but that was not happening
with this switch transform.

llvm-svn: 367037
2019-07-25 17:01:12 +00:00
JF Bastien dbc0a5df8d Allow prefetching from non-zero address spaces
Summary:
This is useful for targets which have prefetch instructions for non-default address spaces.

<rdar://problem/42662136>

Subscribers: nemanjai, javed.absar, hiraditya, kbarton, jkorous, dexonsmith, cfe-commits, llvm-commits, RKSimon, hfinkel, t.p.northover, craig.topper, anemet

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D65254

llvm-svn: 367032
2019-07-25 16:11:57 +00:00
Vlad Tsyrklevich 5d5a58317c Revert "[InstCombine] try to narrow a truncated load"
This reverts commit bc4a63fd3c, this is a
speculative revert to fix a number of sanitizer bots (like
sanitizer-x86_64-linux-bootstrap-ubsan) that have started to see stage2
compiler crashes, presumably due to a miscompile.

llvm-svn: 367029
2019-07-25 15:37:57 +00:00
Florian Hahn c0d0e3bda8 [PredicateInfo] Use SmallVector instead of SmallPtrSet.
We do not need the SmallPtrSet to avoid adding duplicates to
OpsToRename, because we already keep a ValueInfo mapping. If we see an
op for the first time, Infos will be empty and we can also add it to
OpsToRename.

We process operands by visiting BBs depth-first and then iterate over
all instructions & users, so the order should be deterministic.
Therefore we can skip one round of sorting, which we purely needed for
guaranteeing a deterministic order when iterating over the SmallPtrSet.

Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D64816

llvm-svn: 367028
2019-07-25 15:35:10 +00:00
Sanjay Patel 38a0200868 [Utils] remove duplicated documentation comments; NFC
http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments

llvm-svn: 367015
2019-07-25 13:11:21 +00:00
Sanjay Patel bc4a63fd3c [InstCombine] try to narrow a truncated load
trunc (load X) --> load (bitcast X to narrow type)

We have this transform in DAGCombiner::ReduceLoadWidth(), but the truncated
load pattern can interfere with other instcombine transforms, so I'd like to
allow the fold sooner.

Example:
https://bugs.llvm.org/show_bug.cgi?id=16739
...in that report, we have bitcasts bracketing these ops, so those could get
eliminated too.

We've generally ruled out widening of loads early in IR ( LoadCombine -
http://lists.llvm.org/pipermail/llvm-dev/2016-September/105291.html ), but
that reasoning may not apply to narrowing if we can preserve information
such as the dereferenceable range.

Differential Revision: https://reviews.llvm.org/D64432

llvm-svn: 367011
2019-07-25 12:14:27 +00:00
Florian Hahn 5354c83ece [IPSCCP] Add assertion to surface cases where we zap returns with overdefined users.
We should only zap returns in functions, where all live users have a
replace-able value (are not overdefined). Unused return values should be
undefined.

This should make it easier to detect bugs like in PR42738.

Alternatively we could bail out of zapping the function returns, but I
think it would be better to address those divergences between function
and call-site values where they are actually caused.

Reviewers: davide, efriedma

Reviewed By: davide, efriedma

Differential Revision: https://reviews.llvm.org/D65222

llvm-svn: 366998
2019-07-25 09:37:09 +00:00
Sjoerd Meijer 5c606cef79 [LV] Scalar Epilogue Lowering. NFC.
This refactors the boolean 'OptForSize' that was passed around in a lot of places.
It controlled folding of the tail loop, the scalar epilogue, into the main loop,
but code size may not be the only reason to do this. Thus, this is a
first step to generalise the concept of tail-loop folding, and hence OptForSize
has been renamed and now uses an enum ScalarEpilogueStatus that holds the
status of how the epilogue should be lowered.

This will be followed up by D65197, that picks up the predicate loop hint and
performs the tail-loop folding.

Differential Revision: https://reviews.llvm.org/D64916

llvm-svn: 366993
2019-07-25 08:06:02 +00:00
Chen Zheng a2d74d3d90 [PowerPC] exclude more icmps in LSR which are converted in the later hardware loop pass
Differential Revision: https://reviews.llvm.org/D64795

llvm-svn: 366976
2019-07-25 01:22:08 +00:00
Evandro Menezes 5cd5f9b65d [InstCombine] Swap order of checks to improve compile time (NFC)
llvm-svn: 366962
2019-07-24 23:31:04 +00:00
Sanjay Patel 86e9f9dc26 [Transforms] move copying of load metadata to helper function; NFC
There's another proposed load combine that can make use of this code
in D64432.

llvm-svn: 366949
2019-07-24 22:11:11 +00:00
Craig Topper e9abc8177a [InstCombine] Teach foldOrOfICmps to allow icmp eq MIN_INT/MAX to be part of a range comparison. Similar for foldAndOfICmps
We can treat icmp eq X, MIN_UINT as icmp ule X, MIN_UINT and allow
it to merge with icmp ugt X, C. Similar for the other constants.

We can do similar for icmp ne X, (U)INT_MIN/MAX in foldAndOfICmps. And we already handled UINT_MIN there.
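
As a sanity check of the unsigned case, the Python sketch below compares the two-compare form against one possible single range check at 8 bits. The merged form `(X - 1) uge C` is an assumption chosen for illustration; the commit message does not state the exact IR that is produced.

```python
BITS = 8
MASK = (1 << BITS) - 1


def two_compares(x, c):
    # (icmp eq X, 0) | (icmp ugt X, C)
    return x == 0 or x > c


def one_range_check(x, c):
    # (X - 1) uge C, with the subtraction wrapping at BITS bits
    return ((x - 1) & MASK) >= c


equivalent = all(
    two_compares(x, c) == one_range_check(x, c)
    for x in range(1 << BITS)
    for c in range(1 << BITS)
)
```

The wraparound is what makes this work: `X == 0` maps to the largest unsigned value after subtracting 1, so it lands inside the same range as `X > C`.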

Fixes PR42691.

Differential Revision: https://reviews.llvm.org/D65017

llvm-svn: 366945
2019-07-24 20:57:29 +00:00
David Bolvansky d2904ccf88 Let CorrelatedValuePropagation preserve LazyValueInfo
Summary:
This patch makes CorrelatedValuePropagation preserve LazyValueInfo by adding LazyValueInfo::eraseValue & calling it whenever an instruction is erased.

Passes `make check` , test-suite, and SPECrate 2017.

Patch by aqjune (Juneyoung Lee)

Reviewers: reames, mzolotukhin

Reviewed By: reames

Subscribers: xbolva00, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59349

llvm-svn: 366942
2019-07-24 20:27:32 +00:00
Petr Hosek 8b161bacf4 [SafeStack] Insert the deref before remaining elements
This is a follow up to D64971. While we need to insert the deref after
the offset, it needs to come before the remaining elements in the
original expression since the deref needs to happen before the LLVM
fragment if present.

Differential Revision: https://reviews.llvm.org/D65172

llvm-svn: 366865
2019-07-24 00:16:23 +00:00
Philip Reames ea5c94b497 [IndVars] Fix a subtle bug in optimizeLoopExits
The original code failed to account for the fact that one exit can have a pointer exit count without all of them having pointer exit counts.  This could cause two separate bugs:
1) We might exit the loop early, and leave optimizations undone.  This is what triggered the assertion failure in the reported test case.
2) We might optimize one exit, then exit without indicating a change.  This could result in an analysis invalidation bug if no other transform is done by the rest of indvars.

Note that the pointer exit counts are a really fragile concept.  They show up only when we have a pointer IV w/o a datalayout to provide their size.  It's really questionable to me whether the complexity implied is worth it.

llvm-svn: 366829
2019-07-23 17:45:11 +00:00
Simon Pilgrim 5d4bb8628c [SLPVectorizer] Revert local change that accidentally got committed in rL366799
This wasn't part of D63281

llvm-svn: 366807
2019-07-23 13:42:01 +00:00
Simon Pilgrim 743d45ee25 [TargetLowering] Add SimplifyMultipleUseDemandedBits
This patch introduces the DAG version of SimplifyMultipleUseDemandedBits, which attempts to peek through ops (mainly and/or/xor so far) that don't contribute to the demandedbits/elts of a node - which means we can do this even in cases where we have multiple uses of an op, which normally requires us to demand all bits/elts. The intention is to remove a similar instruction - SelectionDAG::GetDemandedBits - once SimplifyMultipleUseDemandedBits has matured.

The InstCombine version of SimplifyMultipleUseDemandedBits can constant fold which I haven't added here yet, and so far I've only wired this up to some basic binops (and/or/xor/add/sub/mul) to demonstrate its use.

We do see a couple of regressions that need to be addressed:

    AMDGPU unsigned dot product codegen retains an AND mask (for ZERO_EXTEND) that it previously removed (but otherwise the dotproduct codegen is a lot better).
	
    X86/AVX2 has poor handling of vector ANY_EXTEND/ANY_EXTEND_VECTOR_INREG - it prematurely gets converted to ZERO_EXTEND_VECTOR_INREG.

The code owners have confirmed it's ok for these cases to be fixed up in future patches.

Differential Revision: https://reviews.llvm.org/D63281

llvm-svn: 366799
2019-07-23 12:39:08 +00:00
Simon Pilgrim 87adcf8c47 [SLPVectorizer] Remove null-pointer test. NFCI.
cast<CallInst> shouldn't return null and we dereference the pointer in a lot of other places, causing both MSVC + cppcheck to warn about dereferenced null pointers

llvm-svn: 366793
2019-07-23 10:51:43 +00:00
Hideto Ueno 9f5d80d79c [Attributor][NFC] Re-run clang-format on the Attributor.cpp
llvm-svn: 366789
2019-07-23 08:29:22 +00:00
Hideto Ueno 19c07afe17 [Attributor] Deduce "dereferenceable" attribute
Summary:
Deduce dereferenceable attribute in Attributor.

These will be added in a later patch.
* dereferenceable(_or_null)_globally (D61652)
* Deduction based on load instruction (similar to D64258)

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64876

llvm-svn: 366788
2019-07-23 08:16:17 +00:00
Robert Widmann fcf3c55a8c [LLVM-C] Improve Bindings to The Internalize Pass
Summary: Adds a binding to the internalize pass that allows the caller to pass a function pointer that acts as the visibility-preservation predicate.  Previously, one could only pass an unsigned value (not LLVMBool?) that directed the pass to consider "main" or not.

Reviewers: whitequark, deadalnix, harlanhaskins

Reviewed By: whitequark, harlanhaskins

Subscribers: kren1, hiraditya, llvm-commits, harlanhaskins

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62456

llvm-svn: 366777
2019-07-23 04:56:44 +00:00
Richard Trieu 3a52c3857f Inline function call into assert to fix unused variable warning.
llvm-svn: 366774
2019-07-23 03:10:06 +00:00
Stefan Stipanovic 6058b86373 Fixing build error from commit 95cbc3d
[Attributor] Liveness analysis.

Liveness analysis abstract attribute used to indicate which BasicBlocks are dead and can therefore be ignored.
Right now we are only looking at noreturn calls.

Reviewers: jdoerfert, uenoku

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D64162

llvm-svn: 366769
2019-07-22 23:58:23 +00:00
Stefan Stipanovic 5a9ba27c71 Revert "Fixing build error from commit 9285295."
This reverts commit 95cbc3da88.

llvm-svn: 366759
2019-07-22 22:55:05 +00:00
Stefan Stipanovic 95cbc3da88 Fixing build error from commit 9285295.
[Attributor] Liveness analysis.

Liveness analysis abstract attribute used to indicate which BasicBlocks are dead and can therefore be ignored.
Right now we are only looking at noreturn calls.

Reviewers: jdoerfert, uenoku

Subscribers: hiraditya, llvm-commits

Differential revision: https://reviews.llvm.org/D64162

llvm-svn: 366753
2019-07-22 22:10:59 +00:00
Roman Lebedev 3a94765bfc [NFC][PatternMatch] Refactor code into a proper "matcher for any integral constant"
Having it as a proper matcher is better for reusability elsewhere
(in a follow-up patch.)

llvm-svn: 366752
2019-07-22 22:09:24 +00:00
Eric Christopher 77dc6d2479 Temporarily Revert "[Attributor] Liveness analysis." as it's breaking the build.
This reverts commit 9285295f75.

llvm-svn: 366737
2019-07-22 21:04:23 +00:00
Stefan Stipanovic 9285295f75 [Attributor] Liveness analysis.
Liveness analysis abstract attribute used to indicate which BasicBlocks are dead and can therefore be ignored.
Right now we are only looking at noreturn calls.

Reviewers: jdoerfert, uenoku

Subscribers: hiraditya, llvm-commits

Differential revision: https://reviews.llvm.org/D64162

llvm-svn: 366736
2019-07-22 20:54:30 +00:00
Stefan Stipanovic 69ebb02001 [Attributor] NoAlias on return values.
Porting function return value attribute noalias to attributor.
This will be followed with a patch for callsite and function arguments.

Reviewers: jdoerfert

Subscribers: lebedev.ri, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D63067

llvm-svn: 366728
2019-07-22 19:36:27 +00:00
Petr Hosek f6cd6ffbc9 [SafeStack] Insert the deref after the offset
While debugging code that uses SafeStack, we've noticed that LLVM
produces an invalid DWARF. Concretely, in the following example:

  int main(int argc, char* argv[]) {
    std::string value = "";
    printf("%s\n", value.c_str());
    return 0;
  }

DWARF would describe the value variable as being located at:

  DW_OP_breg14 R14+0, DW_OP_deref, DW_OP_constu 0x20, DW_OP_minus

The assembly to get this variable is:

  leaq    -32(%r14), %rbx

The order of operations in the DWARF symbols is incorrect in this case.
Specifically, the deref is incorrect; this appears to be incorrectly
re-inserted in replaceOneDbgValueForAlloca.

With this change which inserts the deref after the offset instead of
before it, LLVM produces correct DWARF:

  DW_OP_breg14 R14-32

Differential Revision: https://reviews.llvm.org/D64971

llvm-svn: 366726
2019-07-22 18:52:42 +00:00
Peter Collingbourne ef5cfc2dae WholeProgramDevirt: Teach the pass to respect the global's alignment.
The bytes inserted before an overaligned global need to be padded according
to the alignment set on the original global in order for the initializer
to meet the global's alignment requirements. The previous implementation
that padded to the pointer width happened to be correct for vtables on most
platforms but may do the wrong thing if the vtable has a larger alignment.

This issue is visible with a prototype implementation of HWASAN for globals,
which will overalign all globals including vtables to 16 bytes.

There is also no padding requirement for the bytes inserted after the global
because they are never read from nor are they significant for alignment
purposes, so stop inserting padding there.

Differential Revision: https://reviews.llvm.org/D65031

llvm-svn: 366725
2019-07-22 18:50:45 +00:00
Peter Collingbourne c3b8661df5 LowerTypeTests: Teach the pass to respect global alignments.
We were previously ignoring alignment entirely when combining globals
together in this pass. There are two main things that we need to do here:
add additional padding before each global to meet the alignment requirements,
and set the combined global's alignment to the maximum of all of the original
globals' alignments.

Since we now need to calculate layout as we go anyway, use the calculated
layout to produce GlobalLayout instead of using StructLayout.
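
The two steps described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pass's implementation: `align_to` and `combine_globals` are hypothetical helper names, and each global is modelled as a plain `(size, alignment)` pair in bytes.

```python
def align_to(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def combine_globals(globals_):
    """Lay out (size, alignment) pairs in order.

    Returns the offset of each global inside the combined global, the total
    size, and the alignment the combined global needs (the maximum of the
    individual alignments).
    """
    offsets = []
    offset = 0
    max_align = 1
    for size, alignment in globals_:
        offset = align_to(offset, alignment)   # padding before this global
        offsets.append(offset)
        offset += size
        max_align = max(max_align, alignment)
    return offsets, offset, max_align
```

For example, `combine_globals([(3, 1), (8, 8), (2, 4)])` places the second global at offset 8 (five bytes of padding after the first) and reports a combined alignment of 8.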

Differential Revision: https://reviews.llvm.org/D65033

llvm-svn: 366722
2019-07-22 18:47:03 +00:00
Simon Pilgrim 3ebd2fe91a [SLPVectorizer] Fix some MSVC/cppcheck uninitialized variable warnings. NFCI.
llvm-svn: 366712
2019-07-22 17:57:36 +00:00
Christudasan Devadasan 006cf8c03d Added address-space mangling for stack related intrinsics
Modified the following 3 intrinsics:
int_addressofreturnaddress,
int_frameaddress & int_sponentry.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D64561

llvm-svn: 366679
2019-07-22 12:42:48 +00:00
Serguei Katkov c6c31da867 [Loop Peeling] Fix the handling of branch weights of peeled off branches.
Current algorithm to update branch weights of latch block and its copies is
based on the assumption that number of peeling iterations is approximately equal
to trip count.

However, this assumption is not correct. The profitability check can also decide to peel
when doing so helps reduce the number of phi nodes. In that case the number of peeled
iterations can be less than the estimated trip count.

This patch introduces another way to set the branch weights of peeled-off branches.
Let F be the weight of the edge from latch to header.
Let E be the weight of the edge from latch to exit.
F/(F+E) is the probability of going back to the loop and E/(F+E) is the probability of exiting.
Then, Estimated TripCount = F / E.
For the I-th (counting from 0) peeled-off iteration we set the weights for
the peeled latch to (TC - I, 1). This gives a reasonable distribution:
the probability to go to exit 1/(TC-I) increases. At the same time
the estimated trip count of the remaining loop is reduced by I.

As a result, after peeling off N iterations the weights will be
(F - N * E, E) and the trip count of the loop becomes
F / E - N or TC - N.
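
The arithmetic can be checked with a small sketch. The function names are hypothetical, the weights are plain Python numbers rather than LLVM branch-weight metadata, and it assumes F > N * E so the remaining header weight stays positive.

```python
def estimated_trip_count(header_weight, exit_weight):
    """Estimated TripCount = F / E for latch weights (F, E)."""
    return header_weight / exit_weight


def peel_branch_weights(f, e, n):
    """Weights for n peeled latches and for the remaining loop latch.

    Assumes f > n * e; this sketch does not handle the case where the
    remaining header weight would drop to zero or below.
    """
    tc = estimated_trip_count(f, e)
    peeled = [(tc - i, 1) for i in range(n)]   # I-th peeled latch: (TC - I, 1)
    remaining = (f - n * e, e)                 # latch of the loop that is left
    return peeled, remaining
```

With F = 100, E = 10 and N = 3, the peeled latches get (10, 1), (9, 1), (8, 1), the remaining loop gets (70, 10), and its estimated trip count is 7, which is TC - N.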

The idea is taken from the review of the patch D63918 proposed by Philip.

Reviewers: reames, mkuper, iajbar, fhahn
Reviewed By: reames
Subscribers: hiraditya, zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D64235

llvm-svn: 366665
2019-07-22 05:15:34 +00:00
Craig Topper e6cd20ba53 [InstCombine] Update comment I missed in r366649. NFC
llvm-svn: 366658
2019-07-21 16:15:03 +00:00
Craig Topper 1d149d08d3 [InstCombine] Remove insertRangeTest code that handles the equality case.
For equality, the function called getTrue/getFalse with the VT
of the comparison input. But getTrue/getFalse need the boolean VT.
So if this code ever executed, it would assert.

I believe these cases are removed by InstSimplify so we don't get here.

So this patch just fixes up an assert to exclude the equality
possibility and removes the broken code.

llvm-svn: 366649
2019-07-21 06:43:38 +00:00
Craig Topper 8fabdfe9fc [InstCombine] Don't use AddOne/SubOne to see if two APInts are 1 apart. Use APInt operations instead. NFCI
AddOne/SubOne create new Constant objects. That seems heavy for
comparing ConstantInts which wrap APInts. Just do the math on
on the APInts and compare them.

llvm-svn: 366648
2019-07-21 05:26:05 +00:00
Florian Hahn 0a7faa4e3d [Local] Zap blockaddress without users in ConstantFoldTerminator.
If the blockaddress is not destroyed, the destination block will still
be marked as having its address taken, limiting further transformations.

I think there are other places where the dead blockaddress constants are kept
around; I'll look into that as a follow-up.

Reviewers: craig.topper, brzycki, davide

Reviewed By: brzycki, davide

Differential Revision: https://reviews.llvm.org/D64936

llvm-svn: 366633
2019-07-20 12:25:47 +00:00
Hubert Tong 2711e16b35 [sanitizers] Use covering ObjectFormatType switches
Summary:
This patch removes the `default` case from some switches on
`llvm::Triple::ObjectFormatType`, and cases for the missing enumerators
(`UnknownObjectFormat`, `Wasm`, and `XCOFF`) are then added.

For `UnknownObjectFormat`, the effect of the action for the `default`
case is maintained; otherwise, where `llvm_unreachable` is called,
`report_fatal_error` is used instead.

Where the `default` case returns a default value, `report_fatal_error`
is used for XCOFF as a placeholder. For `Wasm`, the effect of the action
for the `default` case is maintained.

The code is structured to avoid strongly implying that the `Wasm` case
is present for any reason other than to make the switch cover all
`ObjectFormatType` enumerator values.

Reviewers: sfertile, jasonliu, daltenty

Reviewed By: sfertile

Subscribers: hiraditya, aheejin, sunfish, llvm-commits, cfe-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D64222

llvm-svn: 366544
2019-07-19 08:46:18 +00:00
Serguei Katkov bde33af85a [Loop Peeling] Enable peeling of multiple exits by default.
Enable loop peeling with multiple exits where all non-latch exits
end up with deopt by default.

Reviewers: reames, fhahn
Reviewed By: reames
Subscribers: xbolva00, hiraditya, zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D64619

llvm-svn: 366542
2019-07-19 08:35:45 +00:00
Roman Lebedev f2eb403144 [InstCombine] Dropping redundant masking before left-shift [5/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set, and then performs
left-shift of those bits, if none of the bits that are left after the final
shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
f. `((x << MaskShAmt) a>> MaskShAmt) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
f. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)

Normally, the inner pattern is sign-extend,
but for our purposes it's no different to other patterns:

alive proofs:
f: https://rise4fun.com/Alive/7U3
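
Independently of the Alive proof, the condition for variant f can be sanity-checked by brute force at a small width. The sketch below assumes 8-bit values and shift amounts below the bit width; the helper names are made up for this illustration and nothing here reflects how InstCombine implements the fold.

```python
BITS = 8                      # illustrative width
MASK = (1 << BITS) - 1


def ashr(value, amount):
    """Arithmetic shift right of an unsigned BITS-wide value."""
    signed = value - (1 << BITS) if value & (1 << (BITS - 1)) else value
    return (signed >> amount) & MASK


def with_sign_extend(x, mask_sh, shift_sh):
    """((x << MaskShAmt) a>> MaskShAmt) << ShiftShAmt, truncated to BITS."""
    return (ashr((x << mask_sh) & MASK, mask_sh) << shift_sh) & MASK


def plain_shift(x, shift_sh):
    """x << ShiftShAmt, truncated to BITS."""
    return (x << shift_sh) & MASK


def fold_holds():
    """Check every x and every pair with ShiftShAmt >= MaskShAmt."""
    return all(
        with_sign_extend(x, m, s) == plain_shift(x, s)
        for x in range(MASK + 1)
        for m in range(BITS)
        for s in range(m, BITS)
    )
```

When the condition is violated the two sides can differ: for `x = 0x20`, `MaskShAmt = 2`, `ShiftShAmt = 1`, the sign-extended bits survive the final shift.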

For now let's start with patterns where both shift amounts are variable,
with trivial constant "offset" between them, since i believe this is
both simplest to handle and i think this is most common.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563

Differential Revision: https://reviews.llvm.org/D64524

llvm-svn: 366540
2019-07-19 08:26:58 +00:00
Roman Lebedev 441c9d6ca8 [InstCombine] Dropping redundant masking before left-shift [4/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set, and then performs
left-shift of those bits, if none of the bits that are left after the final
shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
e. `((x << MaskShAmt) l>> MaskShAmt) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
e. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)

alive proofs:
e: https://rise4fun.com/Alive/0FT

For now let's start with patterns where both shift amounts are variable,
with trivial constant "offset" between them, since i believe this is
both simplest to handle and i think this is most common.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563

Differential Revision: https://reviews.llvm.org/D64521

llvm-svn: 366539
2019-07-19 08:26:47 +00:00
Roman Lebedev 3c212ce305 [InstCombine] Dropping redundant masking before left-shift [3/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set, and then performs
left-shift of those bits, if none of the bits that are left after the final
shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
d. `(x & ((-1 << MaskShAmt) >> MaskShAmt)) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
d. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)

alive proofs:
d: https://rise4fun.com/Alive/I5Y

For now let's start with patterns where both shift amounts are variable,
with trivial constant "offset" between them, since i believe this is
both simplest to handle and i think this is most common.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563

Differential Revision: https://reviews.llvm.org/D64519

llvm-svn: 366538
2019-07-19 08:26:37 +00:00
Roman Lebedev 2ebe57386d [InstCombine] Dropping redundant masking before left-shift [2/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set, and then performs
left-shift of those bits, if none of the bits that are left after the final
shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
c. `(x & (-1 >> MaskShAmt)) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
c. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)

alive proofs:
c: https://rise4fun.com/Alive/RgJh

For now let's start with patterns where both shift amounts are variable,
with trivial constant "offset" between them, since i believe this is
both simplest to handle and i think this is most common.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563

Differential Revision: https://reviews.llvm.org/D64517

llvm-svn: 366537
2019-07-19 08:26:25 +00:00
Roman Lebedev 4422a1657c [InstCombine] Dropping redundant masking before left-shift [1/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set, and then performs
left-shift of those bits, if none of the bits that are left after the final
shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
b. `(x & (~(-1 << MaskShAmt))) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
b. `(MaskShAmt+ShiftShAmt) u>= bitwidth(x)`

alive proof:
b: https://rise4fun.com/Alive/y8M

For now let's start with patterns where both shift amounts are variable,
with trivial constant "offset" between them, since i believe this is
both simplest to handle and i think this is most common.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563

Differential Revision: https://reviews.llvm.org/D64514

llvm-svn: 366536
2019-07-19 08:26:13 +00:00
Roman Lebedev a5f0824eb5 [InstCombine] Dropping redundant masking before left-shift [0/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set, and then performs
left-shift of those bits, if none of the bits that are left after the final
shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
a. `(x & ((1 << MaskShAmt) - 1)) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
a. `(MaskShAmt+ShiftShAmt) u>= bitwidth(x)`

alive proof:
a: https://rise4fun.com/Alive/wi9
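
A brute-force check at a small width makes the condition for variant a concrete. This is only a sketch: it assumes 8-bit values, shift amounts below the bit width, and hypothetical helper names, and it is not a substitute for the Alive proof.

```python
BITS = 8                      # illustrative width
MASK = (1 << BITS) - 1


def masked_then_shifted(x, mask_sh, shift_sh):
    """(x & ((1 << MaskShAmt) - 1)) << ShiftShAmt, truncated to BITS."""
    low_bits = (1 << mask_sh) - 1
    return ((x & low_bits) << shift_sh) & MASK


def shifted_only(x, shift_sh):
    """x << ShiftShAmt, truncated to BITS."""
    return (x << shift_sh) & MASK


def fold_holds():
    """Check every x and every pair with MaskShAmt + ShiftShAmt >= BITS."""
    return all(
        masked_then_shifted(x, m, s) == shifted_only(x, s)
        for x in range(MASK + 1)
        for m in range(BITS)
        for s in range(BITS)
        if m + s >= BITS
    )
```

The mask keeps the low MaskShAmt bits, and the shift keeps only the low `BITS - ShiftShAmt` bits of x, so the mask is redundant once it covers all of them. With `MaskShAmt = 2` and `ShiftShAmt = 1` the condition fails and `x = 0xFF` gives different results.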

Indeed, not all of these patterns are canonical.
But since this fold will only produce a single instruction
i'm really interested in handling even uncanonical patterns,
since i have this general kind of pattern in hotpaths,
and it is not totally outlandish for bit-twiddling code.

For now let's start with patterns where both shift amounts are variable,
with trivial constant "offset" between them, since i believe this is
both simplest to handle and i think this is most common.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563

Reviewers: spatel, nikic, huihuiz, xbolva00

Reviewed By: xbolva00

Subscribers: efriedma, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64512

llvm-svn: 366535
2019-07-19 08:25:43 +00:00
Serguei Katkov 0ffa833d54 [LoopInfo] Use early return in branch weight update functions. NFC.
llvm-svn: 366411
2019-07-18 07:36:20 +00:00
Peter Collingbourne 3b82b92c6b hwasan: Initialize the pass only once.
This will let us instrument globals during initialization. This required
making the new PM pass a module pass, which should still provide access to
analyses via the ModuleAnalysisManager.

Differential Revision: https://reviews.llvm.org/D64843

llvm-svn: 366379
2019-07-17 21:45:19 +00:00
Hideto Ueno 4a09a73fb0 [Attributor][NFC] Remove unnecessary debug output
llvm-svn: 366373
2019-07-17 21:11:02 +00:00
Hideto Ueno 11d3710c1c [Attributor] Deduce "willreturn" function attribute
Summary:
Deduce the "willreturn" attribute for functions.

For now, intrinsics are not willreturn. More annotation will be done in another patch.

Reviewers: jdoerfert

Subscribers: jvesely, nhaehnle, nicholas, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63046

llvm-svn: 366335
2019-07-17 15:15:43 +00:00
Philip Reames 6e1c3bb181 [IndVars] Speculative fix for an assertion failure seen in bots
I don't have an IR sample which is actually failing, but the issue described in the comment is theoretically possible, and should be guarded against even if there's a different root cause for the bot failures.

llvm-svn: 366241
2019-07-16 18:23:49 +00:00
Amara Emerson 228a7b4f2a [ADCE] Fix non-deterministic behaviour due to iterating over a pointer set.
Original patch by Yann Laigle-Chapuy

Differential Revision: https://reviews.llvm.org/D64785

llvm-svn: 366215
2019-07-16 15:23:10 +00:00
Rui Ueyama 49a3ad21d6 Fix parameter name comments using clang-tidy. NFC.
This patch applies clang-tidy's bugprone-argument-comment tool
to LLVM, clang and lld source trees. Here is how I created this
patch:

$ git clone https://github.com/llvm/llvm-project.git
$ cd llvm-project
$ mkdir build
$ cd build
$ cmake -GNinja -DCMAKE_BUILD_TYPE=Debug \
    -DLLVM_ENABLE_PROJECTS='clang;lld;clang-tools-extra' \
    -DCMAKE_EXPORT_COMPILE_COMMANDS=On -DLLVM_ENABLE_LLD=On \
    -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ ../llvm
$ ninja
$ parallel clang-tidy -checks='-*,bugprone-argument-comment' \
    -config='{CheckOptions: [{key: StrictMode, value: 1}]}' -fix \
    ::: ../llvm/lib/**/*.{cpp,h} ../clang/lib/**/*.{cpp,h} ../lld/**/*.{cpp,h}

llvm-svn: 366177
2019-07-16 04:46:31 +00:00
Peter Collingbourne e5c4b468f0 hwasan: Pad arrays with non-1 size correctly.
Spotted by eugenis.

Differential Revision: https://reviews.llvm.org/D64783

llvm-svn: 366171
2019-07-16 03:25:50 +00:00
Eric Christopher 93dfb93ad6 Temporarily Revert "[SLP] Recommit: Look-ahead operand reordering heuristic."
As there are some reported miscompiles with AVX512 and performance regressions
in Eigen. Verified with the original committer and testcases will be forthcoming.

This reverts commit r364964.

llvm-svn: 366154
2019-07-15 23:36:02 +00:00
Leonard Chan bb147aabc6 Revert "[NewPM] Port Sancov"
This reverts commit 5652f35817.

llvm-svn: 366153
2019-07-15 23:18:31 +00:00
Evgeniy Stepanov c5e7f56249 ARM MTE stack sanitizer.
Add "memtag" sanitizer that detects and mitigates stack memory issues
using armv8.5 Memory Tagging Extension.

It is similar in principle to HWASan, which is a software implementation
of the same idea, but there are enough differences to warrant a new
sanitizer type IMHO. It is also expected to have very different
performance properties.

The new sanitizer does not have a runtime library (it may grow one
later, along with a "debugging" mode). Similar to SafeStack and
StackProtector, the instrumentation pass (in a follow up change) will be
inserted in all cases, but will only affect functions marked with the
new sanitize_memtag attribute.

Reviewers: pcc, hctim, vitalybuka, ostannard

Subscribers: srhines, mehdi_amini, javed.absar, kristof.beyls, hiraditya, cryptoad, steven_wu, dexonsmith, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D64169

llvm-svn: 366123
2019-07-15 20:02:23 +00:00
Johannes Doerfert 3dcd7996f1 [FunctionAttrs] Remove readonly and writeonly assertion
There are scenarios where mutually recursive functions may cause the SCC
to contain both read only and write only functions. This removes an
assertion when adding read attributes which caused a crash with the
provided test case, and instead just doesn't add the attributes.

Patch by Luke Lau <luke.lau@intel.com>

Differential Revision: https://reviews.llvm.org/D60761

llvm-svn: 366090
2019-07-15 17:31:26 +00:00