Commit Graph

23123 Commits

Author SHA1 Message Date
Johannes Doerfert 1e46eb74be [Attributor][FIX] Avoid dangling value pointers during code modification
When we replace instructions with unreachable we delete instructions. We
now avoid dangling pointers to those deleted instructions in the
`ToBeChangedToUnreachableInsts` set. Other modification collections
might need to be updated in the future as well.
2020-01-08 19:32:37 -06:00
Kazu Hirata 2d258ed931 Revert "[JumpThreading] Thread jumps through two basic blocks"
It looks like my patch breaks the sanitizer-windows build:

http://lab.llvm.org:8011/builders/sanitizer-windows/builds/56324

This reverts commit ead815924e.
2020-01-08 13:58:39 -08:00
Kazu Hirata ead815924e [JumpThreading] Thread jumps through two basic blocks
Summary:
This patch teaches JumpThreading.cpp to thread through two basic
blocks like:

  bb3:
    %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6

by duplicating basic blocks like bb3 above.  Once we duplicate bb3 as
bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have:

  bb3:
    %var = phi i32* [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb3.dup:
    %var = phi i32* [ null, %bb1 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6

Then the existing code in JumpThreading.cpp can thread edge
bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5.

Reviewers: wmi

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70247
2020-01-08 06:57:36 -08:00
Kadir Cetinkaya b212eb7159
Revert "[InstCombine] fold zext of masked bit set/clear"
This reverts commit a041c4ec6f.

This looks like a non-trivial change and there has been no code
reviews (at least there were no phabricator revisions attached to the
commit description). It is also causing a regression in one of our
downstream integration tests, we haven't been able to come up with a
minimal reproducer yet.
2020-01-08 11:21:21 +01:00
Philip Reames 312a532dc0 [GVN/FP] Considate logic for reasoning about equality vs equivalance for floats
Factor out common logic into some reasonable commented helper functions. In the process, ensure that the in-block vs cross-block cases are handled the same. They previously weren't.

Differential Revision: https://reviews.llvm.org/D67126
2020-01-07 16:05:04 -08:00
Sanjay Patel f8962571f7 [InstCombine] try to pull 'not' of select into compare operands
not (select ?, (cmp TPred, ?, ?), (cmp FPred, ?, ?) -->
     select ?, (cmp TPred', ?, ?), (cmp FPred', ?, ?)

If both sides of the select are cmps, we can remove an instruction.
The case where only side is a cmp is deferred to a possible
follow-on patch.

We have a more general 'isFreeToInvert' analysis, but I'm not seeing
a way to use that more widely without inducing infinite looping
(opposing transforms).
Here, we flip the compare predicates directly, so we should not have
any danger by creating extra intermediate 'not' ops.

Alive proofs:
https://rise4fun.com/Alive/jKa

Name: both select values are compares - invert predicates
  %tcmp = icmp sle i32 %x, %y
  %fcmp = icmp ugt i32 %z, %w
  %sel = select i1 %cond, i1 %tcmp, i1 %fcmp
  %not = xor i1 %sel, true
=>
  %tcmp_not = icmp sgt i32 %x, %y
  %fcmp_not = icmp ule i32 %z, %w
  %not = select i1 %cond, i1 %tcmp_not, i1 %fcmp_not

Name: false val is compare - invert/not
  %fcmp = icmp ugt i32 %z, %w
  %sel = select i1 %cond, i1 %tcmp, i1 %fcmp
  %not = xor i1 %sel, true
=>
  %tcmp_not = xor i1 %tcmp, -1
  %fcmp_not = icmp ule i32 %z, %w
  %not = select i1 %cond, i1 %tcmp_not, i1 %fcmp_not

Differential Revision: https://reviews.llvm.org/D72007
2020-01-07 10:44:23 -05:00
Simon Pilgrim bd1dc6a3eb Fix "use of uninitialized variable" static analyzer warnings. NFCI. 2020-01-07 10:55:37 +00:00
Fangrui Song 6904cd9486 Add Triple::isX86()
Reviewed By: craig.topper, skan

Differential Revision: https://reviews.llvm.org/D72247
2020-01-06 15:51:02 -08:00
James Henderson d68904f957 [NFC] Fix trivial typos in comments
Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D72143

Patch by Kazuaki Ishizaki.
2020-01-06 10:50:26 +00:00
Brian Gesiak 83a9321f60 [Coroutines] Remove corresponding phi values when apply simplifyTerminatorLeadingToRet
Summary:
In addMustTailToCoroResumes, we set musttail on those resume instructions that are followed by a ret instruction. This is done by simplifyTerminatorLeadingToRet which replace a sequence of branches leading to a ret with a clone of the ret.

However it forgets to remove corresponding PHI values that come from basic block of replaced branch, and may cause jumpthreading pass hangs (https://bugs.llvm.org/show_bug.cgi?id=43720)

This patch fix this issue

Test Plan:
cppcoro library with O3+flto
check-llvm

Reviewers: modocache, GorNishanov, lewissbaker

Reviewed By: modocache

Subscribers: mehdi_amini, EricWF, hiraditya, dexonsmith, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71826

Patch by junparser (JunMa)!
2020-01-05 18:26:30 -05:00
Florian Hahn b8a3c34eee Revert "[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC)."
This reverts commit 51ef53f3bd, as it
breaks some bots.
2020-01-04 18:44:38 +00:00
Florian Hahn 51ef53f3bd [SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC).
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.

Reviewers: sanjoy.google, efriedma, reames

Reviewed By: sanjoy.google

Differential Revision: https://reviews.llvm.org/D71537
2020-01-04 18:29:35 +00:00
Florian Hahn 99f74a64a2 [SCEV] Remove unused ScalarEvolutionExpander.h includes (NFC). 2020-01-04 18:29:35 +00:00
Roman Lebedev 6d05bc2e3a
[NFCI][InstCombine] Refactor 'sink negation into select if that folds one hand of select to 0' fold
I would think it's better than having two practically identical folds
next to eachother, but then generalization isn't all that pretty
due to the fact that we need to produce different `sub` each time..

This change is no-functional-changes-intended refactoring.
2020-01-04 17:30:51 +03:00
Roman Lebedev 772ede3d5d
[InstCombine] Sink sub into hands of select if one hand becomes zero. Part 2 (PR44426)
This decreases use count of %Op0, makes one hand of select to be 0,
and possibly exposes further folding potential.

Name: sub %Op0, (select %Cond, %Op0, %FalseVal) -> select %Cond, 0, (sub %Op0, %FalseVal)
  %Op0 = %TrueVal
  %o = select i1 %Cond, i8 %Op0, i8 %FalseVal
  %r = sub i8 %Op0, %o
=>
  %n = sub i8 %Op0, %FalseVal
  %r = select i1 %Cond, i8 0, i8 %n

Name: sub %Op0, (select %Cond, %TrueVal, %Op0) -> select %Cond, (sub %Op0, %TrueVal), 0
  %Op0 = %FalseVal
  %o = select i1 %Cond, i8 %TrueVal, i8 %Op0
  %r = sub i8 %Op0, %o
=>
  %n = sub i8 %Op0, %TrueVal
  %r = select i1 %Cond, i8 %n, i8 0

https://rise4fun.com/Alive/aHRt

https://bugs.llvm.org/show_bug.cgi?id=44426
2020-01-04 17:30:51 +03:00
Roman Lebedev 4d8e47ca18
[InstCombine] Sink sub into hands of select if one hand becomes zero (PR44426)
This decreases use count of %Op1, makes one hand of select to be 0,
and possibly exposes further folding potential.

Name: sub (select %Cond, %Op1, %FalseVal), %Op1 -> select %Cond, 0, (sub %FalseVal, %Op1)
  %Op1 = %TrueVal
  %o = select i1 %Cond, i8 %Op1, i8 %FalseVal
  %r = sub i8 %o, %Op1
=>
  %n = sub i8 %FalseVal, %Op1
  %r = select i1 %Cond, i8 0, i8 %n

Name: sub (select %Cond, %TrueVal, %Op1), %Op1 -> select %Cond, (sub %TrueVal, %Op1), 0
  %Op1 = %FalseVal
  %o = select i1 %Cond, i8 %TrueVal, i8 %Op1
  %r = sub i8 %o, %Op1
=>
  %n = sub i8 %TrueVal, %Op1
  %r = select i1 %Cond, i8 %n, i8 0

https://rise4fun.com/Alive/avL

https://bugs.llvm.org/show_bug.cgi?id=44426
2020-01-04 17:30:51 +03:00
Alexey Lapshin 831bfcea47 [Transforms][GlobalSRA] huge array causes long compilation time and huge memory usage.
Summary:
For artificial cases (huge array, few usages), Global SRA optimization creates
a lot of redundant data. It creates an instance of GlobalVariable for each array
element. For huge array, that means huge compilation time and huge memory usage.
Following example compiles for 10 minutes and requires 40GB of memory.

namespace {
  char LargeBuffer[64 * 1024 * 1024];
}

int main ( void ) {

    LargeBuffer[0] = 0;

    printf("\n ");

    return LargeBuffer[0] == 0;
}

The fix is to avoid Global SRA for large arrays.

Reviewers: craig.topper, rnk, efriedma, fhahn

Reviewed By: rnk

Subscribers: xbolva00, lebedev.ri, lkail, merge_guards_bot, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71993
2020-01-04 16:42:38 +03:00
Roman Lebedev 7973aa05f6
[NFC][InstCombine] '(Op1 & С) - Op1' -> '-(Op1 & ~C)' fold (PR44427)
This decreases use count of Op1, potentially allows
us to further hoist said 'neg' later on,
and results in marginally better X86 codegen.

Name: (Op1 & С) - Op1 -> -(Op1 & ~C)
  %o = and i64 %Op1, C1
  %r = sub i64 %o, %Op1
=>
  %n = and i64 %Op1, ~C1
  %r = sub i64 0, %n

https://rise4fun.com/Alive/rwgA

https://godbolt.org/z/R_RMfM

https://bugs.llvm.org/show_bug.cgi?id=44427
2020-01-03 21:25:48 +03:00
Roman Lebedev cc0216bedb
[NFC][InstCombine] '(X & (- Y)) - X' -> '- (X & (Y - 1))' fold (PR44448)
Name: (X & (- Y)) - X  ->  - (X & (Y - 1))  (PR44448)
  %negy = sub i8 0, %y
  %unbiasedx = and i8 %negy, %x
  %r = sub i8 %unbiasedx, %x
=>
  %ymask = add i8 %y, -1
  %xmasked = and i8 %ymask, %x
  %r = sub i8 0, %xmasked

https://rise4fun.com/Alive/OIpla

This decreases use count of %x, may allow us to
later hoist said negation even further,
and results in marginally nicer X86 codegen.

See
  https://bugs.llvm.org/show_bug.cgi?id=44448
  https://reviews.llvm.org/D71499
2020-01-03 20:27:29 +03:00
Johannes Doerfert d2d2fb19f7 [Attributor][FIX] Allow dead users of rewritten function
If we replace a function with a new one because we rewrite the
signature, dead users may still refer to the old version. With this
patch we reuse the code that deals with dead functions, which the old
versions are, to avoid problems.
2020-01-03 10:43:40 -06:00
Johannes Doerfert 6b9ee2d6cd [Attributor][NFC] Unify the way we delete dead functions 2020-01-03 10:43:40 -06:00
Johannes Doerfert c90681b681 [Attributor][FIX] Don't crash on ptr2int/int2ptr instructions
An integer isn't allowed in getAlignmentForValue so we need to stop at a
ptr2int instruction during exploration.
2020-01-03 10:43:40 -06:00
Johannes Doerfert 412a0101a9 [Attributor][FIX] Do not derive nonnull and dereferenceable w/o access
An inbounds GEP results in poison if the value is not "inbounds", not in
UB. We accidentally derived nonnull and dereferenceable from these
inbounds GEPs even in the absence of accesses that would make the poison
to UB.
2020-01-03 10:43:40 -06:00
Johannes Doerfert a4b3588ba2 [Attributor][FIX] Return CHANGED once a pessimistic fixpoint is reached. 2020-01-03 10:43:40 -06:00
Ankit 369a919514 Fix for a dangling point bug in DeadStoreElimination pass
The patch makes sure that the LastThrowing pointer does not point to any instruction deleted by call to DeleteDeadInstruction.

While iterating through the instructions the pass maintains a pointer to the lastThrowing Instruction. A call to deleteDeadInstruction deletes a dead store and other instructions feeding the original dead instruction which also become dead. The instruction pointed by the lastThrowing pointer could also be deleted by the call to DeleteDeadInstruction and thus it becomes a dangling pointer. Because of this, we see an error in the next iteration.

In the patch, we maintain a list of throwing instructions encountered previously and use the last non deleted throwing instruction from the container.

Reviewers: fhahn, bcahoon, efriedma

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D65326
2020-01-03 14:28:44 +00:00
Sanjay Patel 1640582743 [InstCombine] replace undef elements in vector constant when doing icmp folds (PR44383)
As shown in P44383:
https://bugs.llvm.org/show_bug.cgi?id=44383
...we can't safely propagate a vector constant through this icmp fold
if that vector constant contains undefined elements.

We know that each defined element of the constant is safe though, so
find the first of those and replicate it into the formerly undef lanes.

Differential Revision: https://reviews.llvm.org/D72101
2020-01-03 09:16:57 -05:00
Hideto Ueno 5fc02dc0a7 Revert "[Attributor] AAValueConstantRange: Value range analysis using constant range"
This reverts commit e996303431.
2020-01-03 11:03:56 +09:00
Sanjay Patel 88fc5fdef6 [InstCombine] remove uses before deleting instructions (PR43723)
This is a less ambitious alternative to previous attempts to fix
this bug with:
rG56b2aee1875a
rGef02831f0a4e
rG56b2aee1875a
...because those all failed bot testing with use-after-free or
other problems.

The original crashing/assert problem is still showing up on
various fuzzers, so I've added a new minimal test based on
another one of those failures.

Instead of trying to manage and coordinate the logic in
isAllocSiteRemovable() with the deletion loops, just loosen
the existing code that handles casts and GEP by replacing
with undef to allow other opcodes. That means that no
instructions with uses should assert on deletion, and there
are hopefully no non-obvious sanitizer bugs induced.
2020-01-02 09:47:36 -05:00
Brian Gesiak 9ce0ff2eef [Coroutines] const-ify internal helpers (NFC)
Several helpers internal to llvm/Transforms/Coroutines do not use
'const' for parameters that are not modified. Add const where possible.
2020-01-01 21:57:49 -05:00
Brian Gesiak 2fcf7691df [Coroutines] Rename "legacy" passes (NFC)
A series of patches beginning with https://reviews.llvm.org/D71898
propose to add an implementation of the coroutine passes to the new pass
manager. As part of these changes, the coroutine passes that implement
the legacy pass manager interface are renamed, to `<PassName>Legacy`.
This mirrors similar changes that have been made to many other passes in
LLVM as they've been transitioned to support both old and new pass
managers.

This commit splits out the renaming portion of that patch and commits it
in advance as an NFC (no functional change intended) commit. It renames:

* `CoroEarly` => `CoroEarlyLegacy`
* `CoroSplit` => `CoroSplitLegacy`
* `CoroElide` => `CoroElideLegacy`
* `CoroCleanup` => `CoroCleanupLegacy`
2020-01-01 21:41:16 -05:00
Nikita Popov 8dd9a13619 [InstCombine] Preserve inbounds when merging with zero-index GEP (PR44423)
This addresses https://bugs.llvm.org/show_bug.cgi?id=44423.
If one of the GEPs is inbounds and the other is zero-index,
we can also preserve inbounds.

Differential Revision: https://reviews.llvm.org/D72060
2020-01-01 23:04:28 +01:00
Nikita Popov 6ba5f8c4ac [InstCombine] Fix incorrect inbounds on GEP of GEP (PR44425)
This fixes https://bugs.llvm.org/show_bug.cgi?id=44425. We need to
drop inbounds if one of the GEPs is not inbounds. This was already
done when creating a new GEP, but not when modifying in place.

Differential Revision: https://reviews.llvm.org/D72059
2020-01-01 22:10:55 +01:00
Hideto Ueno e996303431 [Attributor] AAValueConstantRange: Value range analysis using constant range
This patch introduces `AAValueConstantRange`, which answers a possible range for integer value in a specific program point.
One of the motivations is propagating existing `range` metadata. (I think we need to change the situation that `range` metadata cannot be put to Argument).

The state is a tuple of `ConstantRange` and it is initialized to (known, assumed) = ([-∞, +∞], empty).

Currently, AAValueConstantRange is created when AAValueSimplify cannot
simplify the value.

Supported
 - BinaryOperator(add, sub, ...)
 - CmpInst(icmp eq, ...)
 - !range metadata

`AAValueConstantRange` is not intended to extend to polyhedral range value analysis.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D71620
2020-01-01 15:35:56 +09:00
Craig Topper 374e0299cf [X86][InstCombine] Add constant folding and simplification support for pdep and pext
The instructions use a mask to either pack disjoint bits together(pext) or spread bits to disjoint locations(pdep). If the mask is all 0s then no bits are extracted or deposited. If the mask is all ones, then the source value is written to the result since no compression or expansion happens. Otherwise if both the source and mask are constant we can walk the bits in the source/mask and calculate the result.

There other crazier things we could do like computeKnownBits or turning pext into shift/and if only a single contiguous range of bits is extracted.

Fixes PR44389

Differential Revision: https://reviews.llvm.org/D71952
2019-12-31 15:06:47 -08:00
Sanjay Patel a041c4ec6f [InstCombine] fold zext of masked bit set/clear
This does not solve PR17101, but it is one of the
underlying diffs noted here:
https://bugs.llvm.org/show_bug.cgi?id=17101#c8

We could ease the one-use checks for the 'clear'
(no 'not' op) half of the transform, but I do not
know if that asymmetry would make things better
or worse.

Proofs:
https://rise4fun.com/Alive/uVB

  Name: masked bit set
  %sh1 = shl i32 1, %y
  %and = and i32 %sh1, %x
  %cmp = icmp ne i32 %and, 0
  %r = zext i1 %cmp to i32
  =>
  %s = lshr i32 %x, %y
  %r = and i32 %s, 1

  Name: masked bit clear
  %sh1 = shl i32 1, %y
  %and = and i32 %sh1, %x
  %cmp = icmp eq i32 %and, 0
  %r = zext i1 %cmp to i32
  =>
  %xn = xor i32 %x, -1
  %s = lshr i32 %xn, %y
  %r = and i32 %s, 1
2019-12-31 12:35:10 -05:00
Nikita Popov 7adb5c2aca Revert "[InstCombine] Fix infinite loop due to bitcast <-> phi transforms"
This reverts commit 27a0795943.

Seems to break test-suite.
2019-12-31 17:42:57 +01:00
Nikita Popov 27a0795943 [InstCombine] Fix infinite loop due to bitcast <-> phi transforms
Fix for https://bugs.llvm.org/show_bug.cgi?id=44245.

The optimizeBitCastFromPhi() and FoldPHIArgOpIntoPHI() end up
fighting against each other, because optimizeBitCastFromPhi()
assumes that bitcasts of loads will get folded. This doesn't happen
here, because a dangling phi node prevents the one-use fold in
https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp#L620-L628 from triggering.

This patch fixes the issue by adding manually removing the old phis.

Differential Revision: https://reviews.llvm.org/D71164
2019-12-31 16:17:14 +01:00
Connor Abbott fb114694e9 [InstCombine] Don't rewrite phi-of-bitcast when the phi has other users
Judging by the existing comments, this was the intention, but the
transform never actually checked if the existing phi's would be removed.
See https://bugs.llvm.org/show_bug.cgi?id=44242 for an example where
this causes much worse code generation on AMDGPU.

Differential Revision: https://reviews.llvm.org/D71209
2019-12-31 12:15:02 +01:00
Ilya Biryukov 4f82af81a0 [Attributor] Suppress unused warnings when assertions are disabled. NFC 2019-12-31 10:21:52 +01:00
Johannes Doerfert 751336340d [Attributor] Function signature rewrite infrastructure
As part of the Attributor manifest we want to change the signature of
functions. This patch introduces a fairly generic interface to do so.
As a first, very simple, use case, we remove unused arguments. A second
use case, pointer privatization, will be committed with this patch as
well.

A lot of the code and ideas are taken from argument promotion and we
run all argument promotion tests through this framework as well.

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D68765
2019-12-31 02:31:33 -06:00
Johannes Doerfert dada8132af [Attributor] Propagate known align from arguments to call sites arguments
Since the information is known we can simply use it at the call site.
This is especially useful for callbacks but also helps regular calls.

The test changes are mechanical.
2019-12-31 01:33:22 -06:00
Johannes Doerfert b1b441d22d [Attributor] Use abstract call sites to determine associated arguments
This is the second step after D67871 to make use of abstract call sites.
In this patch the argument we associate with a abstract call site
argument can be the one in the callback callee instead of the one in the
callback broker.

Caveat: We cannot allow no-alias arguments for problematic callbacks:
As described in [1], adding no-alias (or restrict) to arguments could
break synchronization as the synchronization effect, e.g., a barrier,
does not "alias" with the pointer anymore. This disables no-alias
annotation for potentially problematic arguments until we implement the
fix described in [1].

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D68008

[1] Compiler Optimizations for OpenMP, J. Doerfert and H. Finkel,
    International Workshop on OpenMP 2018,
    http://compilers.cs.uni-saarland.de/people/doerfert/par_opt18.pdf
2019-12-31 01:33:22 -06:00
Johannes Doerfert 2888019871 [Attributor] Annotate the memory behavior of call site arguments
Especially for callbacks, annotating the call site arguments is
important. Doing so exposed a too strong dependence of AAMemoryBehavior
on AANoCapture since we handle the case of potentially captured pointers
explicitly.

The changes to the tests are all mechanical.
2019-12-31 01:33:21 -06:00
Sanjay Patel 987eb8e26c [InstCombine] propagate sign argument through nested copysigns
This is another optimization suggested in PR44153:
https://bugs.llvm.org/show_bug.cgi?id=44153
2019-12-30 11:06:02 -05:00
Evgeniy Brevnov 948e745270 [LV][NFC] Keep dominator tree up to date during vectorization. 2019-12-30 18:38:41 +07:00
Evgeniy Brevnov 1b6286b945 [LV][NFC] Some refactoring and renaming to facilitate next change. 2019-12-30 18:38:41 +07:00
Hideto Ueno 34fe8d0451 [Attributor] Use `changeUseAfterManifest` in AAValueSimplify manifest
Summary: This patch makes `AAValueSimplify` use `changeUsesAfterManifest` in `manifest`. This will invoke simple folding after the manifest.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71972
2019-12-30 17:08:48 +09:00
Hideto Ueno ef4febd85b [Attributor] AAUndefinedBehavior: Check for branches on undef value.
A branch is considered UB if it depends on an undefined / uninitialized value.
At this point this handles simple UB branches in the form: `br i1 undef, ...`
We query `AAValueSimplify` to get a value for the branch condition, so the branch
can be more complicated than just: `br i1 undef, ...`.

Patch By: Stefanos Baziotis (@baziotis)

Reviewers: jdoerfert, sstefan1, uenoku

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D71799
2019-12-29 17:43:00 +09:00
Gil Rapaport d62bf16131 [LV] Use getMask() when printing recipe [NFCI]
Use dedicated API for getting the mask instead of duplicating it.

Differential Revision: https://reviews.llvm.org/D71964
2019-12-29 08:50:40 +02:00
Florian Hahn dc2c9b0fcf [Matrix] Propagate and use shape info for binary operators.
This patch extends the current shape propagation and shape aware
lowering to also support binary operators. Those operators are uniform
with respect to their shape (shape of the input operands is the same as
the shape of their result).

Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D70898
2019-12-27 15:50:47 +00:00
Fangrui Song 7a7334663c Delete llvm.{sig,}{setjmp,longjmp} remnant after r136821
Intrinsic has incorrect argument type!
  i32 (i32*)* @llvm.setjmp

*wipes tear*
2019-12-27 00:00:14 -08:00
Hideto Ueno cb5eb13eaf [Attributor] Add helper to change an instruction to `unreachable` inst
Summary: Calling `changeToUnreachable` in `manifest` from different places might cause really unpredictable problems. As other deleting functions are doing, we need to change these instructions after all `manifest`.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71910
2019-12-27 02:39:37 +09:00
Whitney Tsang d1f41b2ca9 [NFC][LoopFusion] Fix printing of the guard branch.
Reviewer: kbarton, jdoerfert
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D71878
2019-12-26 02:45:29 +00:00
Hideto Ueno 1d5d074aef [Attributor] Reach optimistic fixpoint in AAValueSimplify when the value is constant or undef
Summary:
As discussed in D71799, we have found that it is more useful to reach an optimistic fixpoint in AAValueSimpify when the value is constant or undef.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: baziotis, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71852
2019-12-25 14:18:34 +09:00
Johannes Doerfert 5732f56bbd [Attributor] UB Attribute now handles all instructions that access memory through a pointer
Summary:
Follow-up on: https://reviews.llvm.org/D71435
We basically use `checkForAllInstructions` to loop through all the instructions in a function that access memory through a pointer: load, store, atomicrmw, atomiccmpxchg
Note that we can now use the `getPointerOperand()` that gets us the pointer operand for an instruction that belongs to the aforementioned set.

Question: This function returns `nullptr` if the instruction is `volatile`. Why?
Guess:  Because if it is volatile, we don't want to do any transformation to it.

Another subtle point is that I had to add AtomicRMW, AtomicCmpXchg to `initializeInformationCache()`. Following `checkAllInstructions()` path, that
seemed the most reasonable place to add it and correct the fact that these instructions were ignored (they were not in `OpcodeInstMap` etc.). Is that ok?

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert, sstefan1

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71787
2019-12-24 19:25:08 -06:00
Johannes Doerfert 58f324a468 [Attributor] Function level undefined behavior attribute
_Eventually_, this attribute will be assigned to a function if it
contains undefined behavior. As a first small step, I tried to make it
loop through the load instructions in a function (eventually, the plan
is to check if a load instructions causes undefined behavior, because
e.g. dereferences a null pointer - Also eventually, this won't happen in
initialize() but in updateImpl()).

Patch By: Stefanos Baziotis (@baziotis)

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D71435
2019-12-24 19:23:08 -06:00
Florian Hahn 8d6f59b78a [Matrix] Use fmuladd for matrix.multiply if allowed.
If the matrix.multiply calls have the contract fast math flag, we can
use fmuladd. This als adds a command line option to force fmuladd
generation. We can retire this option once there is a clang-level
option.

Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D70951
2019-12-23 14:49:14 +01:00
Florian Hahn 109e4e3851 [Matrix] Add forward shape propagation and first shape aware lowerings.
This patch adds infrastructure for forward shape propagation to
LowerMatrixIntrinsics. It also updates the pass to make use of
the shape information to break up larger vector operations and to
eliminate unnecessary conversion operations between columnwise matrixes
and flattened vectors: if shape information is available for an
instruction, lower the operation to a set of instructions operating on
columns. For example, a store of a matrix is broken down into separate
stores for each column. For users that do not have shape
information (e.g. because they do not yet support shape information
aware lowering), we pack the result columns into a flat vector and
update those users.

It also adds shape aware lowering for the first non-intrinsic
instruction: vector stores.

Example:

For
  %c  = call <4 x double> @llvm.matrix.transpose(<4 x double> %a, i32 2, i32 2)
  store <4 x double> %c, <4 x double>* %Ptr

We generate the code below without shape propagation. Note %9 which
combines the columns of the transposed matrix into a flat vector.

  %split = shufflevector <4 x double> %a, <4 x double> undef, <2 x i32> <i32 0, i32 1>
  %split1 = shufflevector <4 x double> %a, <4 x double> undef, <2 x i32> <i32 2, i32 3>
  %1 = extractelement <2 x double> %split, i64 0
  %2 = insertelement <2 x double> undef, double %1, i64 0
  %3 = extractelement <2 x double> %split1, i64 0
  %4 = insertelement <2 x double> %2, double %3, i64 1
  %5 = extractelement <2 x double> %split, i64 1
  %6 = insertelement <2 x double> undef, double %5, i64 0
  %7 = extractelement <2 x double> %split1, i64 1
  %8 = insertelement <2 x double> %6, double %7, i64 1
  %9 = shufflevector <2 x double> %4, <2 x double> %8, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
  store <4 x double> %9, <4 x double>* %Ptr

With this patch, we propagate the 2x2 shape information from the
transpose to the store and we generate the code below. Note that we
store the columns directly and do not need an extra shuffle.

  %9 = bitcast <4 x double>* %Ptr to double*
  %10 = bitcast double* %9 to <2 x double>*
  store <2 x double> %4, <2 x double>* %10, align 8
  %11 = getelementptr double, double* %9, i32 2
  %12 = bitcast double* %11 to <2 x double>*
  store <2 x double> %8, <2 x double>* %12, align 8

Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D70897
2019-12-23 13:51:56 +01:00
Dinar Temirbulatov a755ccefe6 [SLP] Replace NeedToGather variable with enum. 2019-12-23 08:21:53 +01:00
Mark de Wever 098d3347e7 [Transforms] Fixes -Wrange-loop-analysis warnings
This avoids new warnings due to D68912 adds -Wrange-loop-analysis to -Wall.

Differential Revision: https://reviews.llvm.org/D71810
2019-12-22 19:20:17 +01:00
Sanjay Patel 9cdcd81d3f [InstCombine] enhance fold for copysign with known sign arg
This is another optimization suggested in PRPR44153:
https://bugs.llvm.org/show_bug.cgi?id=44153
2019-12-22 10:07:01 -05:00
Sanjay Patel 79c7fa31f3 [InstCombine] check alloc size in bitcast of geps fold (PR44321)
We missed a constraint in D44833
when folding a bitcast into a GEP with vector/array types.
If the alloc sizes specified by the datalayout don't match,
this could miscompile as shown in:
https://bugs.llvm.org/show_bug.cgi?id=44321

Differential Revision: https://reviews.llvm.org/D71771
2019-12-21 10:31:21 -05:00
Sanjay Patel 19f9f374d9 [SimplifyLibCalls] require fast-math-flags for pow(X, -0.5) transforms
As discussed in PR44330:
https://bugs.llvm.org/show_bug.cgi?id=44330
...the transform from pow(X, -0.5) libcall/intrinsic to
reciprocal square root can result in small deviations from
the expected result due to differences in the pow()
implementation and/or the extra rounding step from the division.

This patch proposes to allow that difference with either the
'approximate functions' or 'reassociate' FMF:
http://llvm.org/docs/LangRef.html#fast-math-flags

In practice, this likely means that the code is compiled with
all of 'fast' (-ffast-math), but I have preserved the existing
specializations for -0.0/-INF that enable generating safe code
if those special values are allowed simultaneously with
allowing approximation/reassociation.

The question about whether a similar restriction is needed for
the non-reciprocal case -- pow(X, 0.5) -- is deferred. That
transform is allowed without FMF currently, and this patch does
not change that behavior.

Differential Revision: https://reviews.llvm.org/D71706
2019-12-21 10:00:53 -05:00
Jakub Kuderski c431c407eb [InstCombine] Improve infinite loop detection
Summary:
This patch limits the default number of iterations performed by InstCombine. It also exposes a new option that allows to specify how many iterations is considered getting stuck in an infinite loop.

Based on experiments performed on real-world C++ programs, InstCombine seems to perform at most ~8-20 iterations, so treating 1000 iterations as an infinite loop seems like a safe choice. See D71145 for details.

The two limits can be specified via command line options.

Reviewers: spatel, lebedev.ri, nikic, xbolva00, grosser

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71673
2019-12-20 16:15:04 -05:00
Ayal Zaks e498be5738 [LV] Strip wrap flags from vectorized reductions
A sequence of additions or multiplications that is known not to wrap, may wrap
if it's order is changed (i.e., reassociated). Therefore when vectorizing
integer sum or product reductions, their no-wrap flags need to be removed.

Fixes PR43828

Patch by Denis Antrushin

Differential Revision: https://reviews.llvm.org/D69563
2019-12-20 14:48:53 +02:00
Vedant Kumar caaacb8399 HotColdSplitting: Do not outline within noreturn functions
A function marked `noreturn` may contain unreachable terminators: these
should not be considered cold, as the function may be a trampoline.

rdar://58068594
2019-12-19 14:06:24 -08:00
Bjorn Pettersson 89e3bb4502 [ConstantHoisting] Ignore unreachable bb:s when collecting candidates
Summary:
Ignore looking at blocks that are unreachable from entry when
collecting candidates for hosting.

Normally the consthoist pass is executed in the llc pipeline,
just after unreachableblockelim. So it is abnormal to have code
that is unreachable from the entry block. But when running the
pass as part of opt, for example as part of fuzzy testing, we
might trigger various kinds of asserts when collecting candidates
if we include unreachable blocks in that analysis.

It seems like a waste of time to hoist constants in unreachble
blocks, so the solution is to simply ignore such blocks when
collecting the hoisting candidates.

The two added test cases used to end up in two different asserts,
and the intention with the checks is just to verify that we no
longer fail.

Fixes: PR43903

Reviewers: spatel

Reviewed By: spatel

Subscribers: hiraditya, uabelho, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71678
2019-12-19 15:07:55 +01:00
David Green a59cc5e128 [InstCombine] Canonicalize select immediates
In certain situations after inlining and simplification we end up with
code that is _almost_ a min/max pattern, but contains constants that
have been demand-bit optimised to the wrong values, ending up with code
like:
  %1 = icmp slt i32 %shr, -128
  %2 = select i1 %1, i32 128, i32 %shr
  %.inv = icmp sgt i32 %shr, 127
  %spec.select.i = select i1 %.inv, i32 127, i32 %2
  %conv7 = trunc i32 %spec.select.i to i8
This should be turned into a min/max pattern, but the -128 in the first
select was instead transformed into 128, as only the bottom byte was
ever demanded.

To fix this, I've put in further canonicalisation for the immediates of
selects, preferring to use the same value as the icmp if available.

Differential Revision: https://reviews.llvm.org/D71516
2019-12-19 12:36:46 +00:00
Piotr Sobczak 40b5a0f7c8 Revert "[InstCombine][AMDGPU] Trim more components of *buffer_load"
Revert D70315, as it breaks gfx8 for some reason.

This reverts commit 65f94b3380.
2019-12-18 22:04:44 +01:00
Kit Barton 3db1cf7a1e [LoopFusion] Use the LoopInfo::isRotatedForm method (NFC).
Loop fusion previously had a method to check whether a loop was in rotated form. This method has
been moved into the LoopInfo class. This patch removes the old isRotated method from loop fusion,
in favour of the new one in LoopInfo.
2019-12-18 15:04:25 -05:00
Jakub Kuderski 3d29c41ad5 [InstCombine] Insert instructions before adding them to worklist
Summary:
This patch adds instructions to the InstCombine worklist after they are properly inserted. This way we don't get `<badref>`s printed when logging added instructions.
It also adds a check in `Worklist::Add` that ensures that all added instructions have parents.

Simple test case that illustrates the difference when run with `--debug-only=instcombine`:

```
define i32 @test35(i32 %a, i32 %b) {
  %1 = or i32 %a, 1135
  %2 = or i32 %1, %b
  ret i32 %2
}
```

Before this patch:
```
INSTCOMBINE ITERATION #1 on test35
IC: ADDING: 3 instrs to worklist
IC: Visiting:   %1 = or i32 %a, 1135
IC: Visiting:   %2 = or i32 %1, %b
IC: ADD:   %2 = or i32 %a, %b
IC: Old =   %3 = or i32 %1, %b
    New =   <badref> = or i32 %2, 1135
IC: ADD:   <badref> = or i32 %2, 1135
...
```

With this patch:
```
INSTCOMBINE ITERATION #1 on test35
IC: ADDING: 3 instrs to worklist
IC: Visiting:   %1 = or i32 %a, 1135
IC: Visiting:   %2 = or i32 %1, %b
IC: ADD:   %2 = or i32 %a, %b
IC: Old =   %3 = or i32 %1, %b
    New =   <badref> = or i32 %2, 1135
IC: ADD:   %3 = or i32 %2, 1135
...
```

Reviewers: fhahn, davide, spatel, foad, grosser, nikic

Reviewed By: nikic

Subscribers: nikic, lebedev.ri, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71093
2019-12-18 14:55:41 -05:00
Jakub Kuderski 406b6019cd [InstCombine] Allow to limit the max number of iterations
Summary:
This patch teaches InstCombine to accept a new parameter: maximum number of iterations over functions.

InstCombine tries to simplify instructions by iterating over the whole function until the function stops changing. As a consequence, the last iteration before reaching a fixpoint visits all instructions in the worklist and never performs any rewrites.

Bounding the number of iterations can have 2 benefits:
* In case the users of the pass can make a good guess about the number of required iterations, we can save the time normally spent on the last iteration that doesn't change anything.
* When the wants to use InstCombine as a cleanup pass, it may be enough to run just a few iterations and stop even before reaching a fixpoint. This can be also useful for implementing a lightweight pass pipeline (think `-O1`).

This patch does not change the behavior of opt or Clang -- limiting the number of iterations is entirely opt-in.

Reviewers: fhahn, davide, spatel, foad, nlopes, grosser, lebedev.ri, nikic, xbolva00

Reviewed By: spatel

Subscribers: craig.topper, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71145
2019-12-18 13:48:54 -05:00
stozer 89d19d60ad Reapply: [DebugInfo] Correctly handle salvaged casts and split fragments at ISel
This reverts commit 1f3dd83cc1, reapplying
commit bb1b0bc4e5.

The original commit failed on some builds seemingly due to the use of a
bracketed constructor with an std::array, i.e. `std::array<> arr({...})`.
2019-12-18 16:26:42 +00:00
Whitney Tsang 9883d7edc6 [LoopUtils] Updated deleteDeadLoop() to handle loop nest.
Reviewer: kariddi, sanjoy, reames, Meinersbur, bmahjour, etiotto,
kbarton
Reviewed By: Meinersbur
Subscribers: mgorny, hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D70939
2019-12-18 15:59:45 +00:00
stozer 1f3dd83cc1 Revert "[DebugInfo] Correctly handle salvaged casts and split fragments at ISel"
Reverted due to build failure on windows bots.

This reverts commit bb1b0bc4e5.
2019-12-18 11:46:10 +00:00
stozer bb1b0bc4e5 [DebugInfo] Correctly handle salvaged casts and split fragments at ISel
Previously, LLVM had no functional way of performing casts inside of a
DIExpression(), which made salvaging cast instructions other than Noop
casts impossible. This patch enables the salvaging of casts by using the
DW_OP_LLVM_convert operator for SExt and Trunc instructions.

There is another issue which is exposed by this fix, in which fragment
DIExpressions (which are preserved more readily by this patch) for
values that must be split across registers in ISel trigger an assertion,
as the 'split' fragments extend beyond the bounds of the fragment
DIExpression causing an error. This patch also fixes this issue by
checking the fragment status of DIExpressions which are to be split, and
dropping fragments that are invalid.
2019-12-18 11:09:18 +00:00
Anna Welker 7cd1cfdd6b [NFC][TTI] Add Alignment for isLegalMasked[Gather/Scatter]
Add an extra parameter so alignment can be taken under
consideration in gather/scatter legalization.

Differential Revision: https://reviews.llvm.org/D71610
2019-12-18 09:14:39 +00:00
Whitney Tsang 36bdc3dc35 [LoopFusion] Move instructions from FC0.Latch to FC1.Latch.
Summary:This PR move instructions from FC0.Latch bottom up to the
beginning of FC1.Latch as long as they are proven safe.

To illustrate why this is beneficial, let's consider the following
example:
Before Fusion:
header1:
  br header2
header2:
  br header2, latch1
latch1:
  br header1, preheader3
preheader3:
  br header3
header3:
  br header4
header4:
  br header4, latch3
latch3:
  br header3, exit3

After Fusion (before this PR):
header1:
  br header2
header2:
  br header2, latch1
latch1:
  br header3
header3:
  br header4
header4:
  br header4, latch3
latch3:
  br header1, exit3

Note that preheader3 is removed during fusion before this PR.
Notice that we cannot fuse loop2 with loop4 as there exists block latch1
in between.
This PR move instructions from latch1 to beginning of latch3, and remove
block latch1. LoopFusion is now able to fuse loop nest recursively.

After Fusion (after this PR):
header1:
  br header2
header2:
  br header3
header3:
  br header4
header4:
  br header2, latch3
latch3:
  br header1, exit3

Reviewer: kbarton, jdoerfert, Meinersbur, dmgreen, fhahn, hfinkel,
bmahjour, etiotto
Reviewed By: kbarton, Meinersbur
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D71165
2019-12-17 22:10:23 +00:00
Stefan Stipanovic fff8ec9813 [Attributor] H2S fix.
Summary: Fixing issues that were noticed in D71521

Reviewers: jdoerfert, lebedev.ri, uenoku

Subscribers:

Differential Revision: https://reviews.llvm.org/D71564
2019-12-17 20:41:09 +01:00
Piotr Sobczak 65f94b3380 [InstCombine][AMDGPU] Trim more components of *buffer_load
Summary:
Add trimming of unused components of s_buffer_load.

Extend trimming of *buffer_load to also include
unused components at the beginning of vectors and update offset.

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70315
2019-12-17 17:50:07 +01:00
Guillaume Chatelet 531c1161b9 Resubmit "[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove"
Summary:
This is a resubmit of D71473.

This patch introduces a set of functions to enable deprecation of IRBuilder functions without breaking out of tree clients.
Functions will be deprecated one by one and as in tree code is cleaned up.

This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: aaron.ballman, courbet

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71547
2019-12-17 10:07:46 +01:00
Whitney Tsang ec4749e3b8 Revert "[LoopUtils] Updated deleteDeadLoop() to handle loop nest."
This reverts commit cd09fee3d6.
This reverts commit c066ff11d8.
2019-12-17 03:51:41 +00:00
Johannes Doerfert 0bc3336ac1 [Attributor][NFC] Clang format the Attributor
The Attributor is always kept formatted so diffs are cleaner.

Sometime we get out of sync for various reasons so we need to format the
file once in a while.
2019-12-16 21:03:18 -06:00
Whitney Tsang c066ff11d8 [LoopUtils] Updated deleteDeadLoop() to handle loop nest.
Reviewer: kariddi, sanjoy, reames, Meinersbur, bmahjour, etiotto,
kbarton
Reviewed By: Meinersbur
Subscribers: mgorny, hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D70939
2019-12-17 01:06:14 +00:00
Kit Barton ff07fc66d9 [LoopFusion] Restrict loop fusion to rotated loops.
Summary:
This patch restricts loop fusion to only consider rotated loops as valid candidates.
This simplifies the analysis and transformation and aligns with other loop optimizations.

Reviewers: jdoerfert, Meinersbur, dmgreen, etiotto, Whitney, fhahn, hfinkel

Reviewed By: Meinersbur

Subscribers: ormris, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71025
2019-12-16 15:17:29 -05:00
Craig Topper 02f644c59a [InstCombine] Teach removeBitcastsFromLoadStoreOnMinMax not to change the size of a store.
We can change the type as long as we don't change the size.

Fixes PR44306

Differential Revision: https://reviews.llvm.org/D71532
2019-12-16 12:12:54 -08:00
Guillaume Chatelet 4658da10e4 Revert "[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove"
This reverts commit 181ab91efc.
2019-12-16 15:19:49 +01:00
Guillaume Chatelet 181ab91efc [Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove
Summary:
This patch introduces a set of functions to enable deprecation of IRBuilder functions without breaking out of tree clients.
Functions will be deprecated one by one and as in tree code is cleaned up.

This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71473
2019-12-16 13:35:55 +01:00
Bjorn Pettersson e5f07080b8 [BasicBlockUtils] Fix dbg.value elimination problem in MergeBlockIntoPredecessor
Summary:
In commit d60f34c20a (llvm-svn 317128,
PR35113) MergeBlockIntoPredecessor was changed into
discarding some dbg.value intrinsics referring to
PHI values, post-splice due to loop rotation.

That elimination of dbg.value intrinsics did not
consider which dbg.value to keep depending on the
context (e.g. if the variable is changing its value
several times inside the basic block).

In the past that hasn't been such a big problem since
CodeGenPrepare::placeDbgValues has moved the dbg.value
to be next to the PHI node anyway. But after commit
00e238896c CodeGenPrepare isn't doing that
any longer, so we need to be more careful when avoiding
duplicate dbg.value intrinsics in MergeBlockIntoPredecessor.

This patch replaces the code that tried to avoid duplicate
dbg.values by using the RemoveRedundantDbgInstrs helper.

Reviewers: aprantl, jmorse, vsk

Reviewed By: aprantl, vsk

Subscribers: jholewinski, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71480
2019-12-16 11:41:21 +01:00
Bjorn Pettersson 1c49553c19 [BasicBlockUtils] Add utility to remove redundant dbg.value instrs
Summary:
Add a RemoveRedundantDbgInstrs to BasicBlockUtils with the
goal to remove redundant dbg intrinsics from a basic block.

This can be useful after various transforms, as it might
be simpler to do a filtering of dbg intrinsics after the
transform than during the transform.
One primary use case would be to replace a too aggressive
removal done by MergeBlockIntoPredecessor, seen at loop
rotate (not done in this patch).

The elimination algorithm currently focuses on dbg.value
intrinsics and is doing two iterations over the BB.

First we iterate backward starting at the last instruction
in the BB. Whenever a consecutive sequence of dbg.value
instructions are found we keep the last dbg.value for
each variable found (variable fragments are identified
using the  {DILocalVariable, FragmentInfo, inlinedAt}
triple as given by the DebugVariable helper class).

Next we iterate forward starting at the first instruction
in the BB. Whenever we find a dbg.value describing a
DebugVariable (identified by {DILocalVariable, inlinedAt})
we save the {DIValue, DIExpression} that describes that
variables value. But if the variable already was mapped
to the same {DIValue, DIExpression} pair we instead drop
the second dbg.value.

To ease the process of making lit tests for this utility a
new pass is introduced called RedundantDbgInstElimination.
It can be executed by opt using -redundant-dbg-inst-elim.

Reviewers: aprantl, jmorse, vsk

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71478
2019-12-16 11:41:21 +01:00
Johannes Doerfert 139c9ef45a [Attributor] Annotate call sites of declarations with a callback
Even if a declaration is called, if there is a callback we might need
the information during CG-SCC traversal (D70767).
2019-12-13 23:51:59 -06:00
Johannes Doerfert 3d347e2835 [Attributor][NFC] Simplify debug printing for abstract attributes
This also fixes a type in the debug printing of AANoAlias.
2019-12-13 23:51:59 -06:00
Johannes Doerfert 5d34602da4 [Attributor] Only replace instruction operands
This was part of D70767. When we replace the value of (call/invoke)
instructions we do not want to disturb the old call graph so we will
only replace instruction uses until we get rid of the old PM.

Accepted as part of D70767.
2019-12-13 22:16:38 -06:00
Francesco Petrogalli 19f73f0d1b Revert "[VectorUtils] Introduce the Vector Function Database (VFDatabase)."
This reverts commit 0be81968a2.

The VFDatabase needs some rework to be able to handle vectorization
and subsequent scalarization of intrinsics in out-of-tree versions of
the compiler. For more details, see the discussion in
https://reviews.llvm.org/D67572.
2019-12-13 19:42:04 +00:00
Fangrui Song 193da743db [profile] Fix a crash when -fprofile-remapping-file= triggers an error
Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D71485
2019-12-13 11:38:20 -08:00
Hiroshi Yamauchi ed50e6060b [PGO][PGSO] Enable size optimizations in code gen / target passes for cold code.
Summary: Split off of D67120.

Reviewers: davidxl

Subscribers: hiraditya, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71288
2019-12-13 11:01:19 -08:00
Nicola Zaghen 97572775d2 Reland [DataLayout] Fix occurrences that size and range of pointers are assumed to be the same.
GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places
in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues
with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered.

This fixes the buildbot failures.

Differential Revision: https://reviews.llvm.org/D68328

Patch by Joseph Faulls!
2019-12-13 14:30:21 +00:00
Evgenii Stepanov dabd2622a8 hwasan: add tag_offset DWARF attribute to optimized debug info
Summary:
Support alloca-referencing dbg.value in hwasan instrumentation.
Update AsmPrinter to emit DW_AT_LLVM_tag_offset when location is in
loclist format.

Reviewers: pcc

Subscribers: srhines, aprantl, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70753
2019-12-12 16:18:54 -08:00
Johannes Doerfert 6abd01e462 [Attributor][FIX] Do treat byval arguments special
When we reason about the pointer argument that is byval we actually
reason about a local copy of the value passed at the call site. This was
not the case before and we wrongly introduced attributes based on the
surrounding function.

AAMemoryBehaviorArgument, AAMemoryBehaviorCallSiteArgument and
AANoCaptureCallSiteArgument are made aware of byval now. The code
to skip "subsuming positions" for reasoning follows a common pattern and
we should refactor it. A TODO was added.

Discovered by @efriedma as part of D69748.
2019-12-12 16:04:21 -06:00
Florian Hahn 526244b187 [Matrix] Add first set of matrix intrinsics and initial lowering pass.
This is the first patch adding an initial set of matrix intrinsics and a
corresponding lowering pass. This has been discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2019-October/136240.html

The first patch introduces four new intrinsics (transpose, multiply,
columnwise load and store) and a LowerMatrixIntrinsics pass, that
lowers those intrinsics to vector operations.

Matrixes are embedded in a 'flat' vector (e.g. a 4 x 4 float matrix
embedded in a <16 x float> vector) and the intrinsics take the dimension
information as parameters. Those parameters need to be ConstantInt.
For the memory layout, we initially assume column-major, but in the RFC
we also described how to extend the intrinsics to support row-major as
well.

For the initial lowering, we split the input of the intrinsics into a
set of column vectors, transform those column vectors and concatenate
the result columns to a flat result vector.

This allows us to lower the intrinsics without any shape propagation, as
mentioned in the RFC. In follow-up patches, we plan to submit the
following improvements:
 * Shape propagation to eliminate the embedding/splitting for each
   intrinsic.
 * Fused & tiled lowering of multiply and other operations.
 * Optimization remarks highlighting matrix expressions and costs.
 * Generate loops for operations on large matrixes.
 * More general block processing for operation on large vectors,
   exploiting shape information.

We would like to add dedicated transpose, columnwise load and store
intrinsics, even though they are not strictly necessary. For example, we
could instead emit a large shufflevector instruction instead of the
transpose. But we expect that to
  (1) become unwieldy for larger matrixes (even for 16x16 matrixes,
      the resulting shufflevector masks would be huge),
  (2) risk instcombine making small changes, causing us to fail to
      detect the transpose, preventing better lowerings

For the load/store, we are additionally planning on exploiting the
intrinsics for better alias analysis.

Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor, efriedma, rengolin

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D70456
2019-12-12 15:42:18 +00:00