Commit Graph

15629 Commits

Author SHA1 Message Date
Roman Lebedev 59387c0dd7
[InstCombine] (-NSW x) s<= x --> x s>= 0 (PR39480)
Name: (-x) s<= x  -->  x >= 0
%neg_x = sub nsw i8 0, %x ; %x must not be INT_MIN
%r = icmp sle i8 %neg_x, %x
  =>
%r = icmp sge i8 %x, 0

https://rise4fun.com/Alive/91k

https://bugs.llvm.org/show_bug.cgi?id=39480
2020-08-06 11:50:35 +03:00
Roman Lebedev 01a6c4bd26
[InstCombine] (-NSW x) s< x --> x s> 0 (PR39480)
Name: (-x) s< x  -->  x > 0
%neg_x = sub nsw i8 0, %x ; %x must not be INT_MIN
%r = icmp slt i8 %neg_x, %x
  =>
%r = icmp sgt i8 %x, 0

https://rise4fun.com/Alive/3IXb

https://bugs.llvm.org/show_bug.cgi?id=39480
2020-08-06 11:50:35 +03:00
Roman Lebedev 3885207651
[InstCombine] (-NSW x) s>= x --> x s<= 0 (PR39480)
Name: (-x) s>= x  -->  x s<= 0
%neg_x = sub nsw i8 0, %x ; %x must not be INT_MIN
%r = icmp sge i8 %neg_x, %x
  =>
%r = icmp sle i8 %x, 0

https://rise4fun.com/Alive/Hdip

https://bugs.llvm.org/show_bug.cgi?id=39480
2020-08-06 11:50:34 +03:00
Roman Lebedev 8878b79cfe
[InstCombine] (-NSW x) ==/!= x --> x ==/!= 0 (PR39480)
Name: (-x) == x  -->  x == 0
%neg_x = sub nsw i8 0, %x ; %x must not be INT_MIN
%r = icmp eq i8 %neg_x, %x
  =>
%r = icmp eq i8 %x, 0

Name: (-x) != x  -->  x != 0
%neg_x = sub nsw i8 0, %x ; %x must not be INT_MIN
%r = icmp ne i8 %neg_x, %x
  =>
%r = icmp ne i8 %x, 0

https://rise4fun.com/Alive/4slH

https://bugs.llvm.org/show_bug.cgi?id=39480
2020-08-06 11:50:34 +03:00
Roman Lebedev 5060f5682b
[InstCombine] (-NSW x) s> x --> x s< 0 (PR39480)
Name: (-x) s> x  -->  x s< 0
%neg_x = sub nsw i8 0, %x ; %x must not be INT_MIN
%r = icmp sgt i8 %neg_x, %x
  =>
%r = icmp slt i8 %x, 0

https://rise4fun.com/Alive/ZslD

https://bugs.llvm.org/show_bug.cgi?id=39480
2020-08-06 11:50:34 +03:00
Roman Lebedev 664e1784cd
[NFC][InstCombine] Add tests for comparisons between x and negation of x (PR39480) 2020-08-06 11:50:34 +03:00
Chuanqi Xu 92f1f1e40d [Coroutines] Use to collect lifetime marker of in CoroFrame Differential Revision: https://reviews.llvm.org/D85279 2020-08-06 14:21:55 +08:00
Douglas Yung bac1a0839f Fix typo in test. Thanks to Andrew Ng for spotting this! 2020-08-05 22:56:15 -07:00
Juneyoung Lee 9f717d7b94 [JumpThreading] Allow duplicating a basic block into preds when its branch condition is freeze(phi)
This is the last JumpThreading patch for getting the performance numbers shown at
https://reviews.llvm.org/D84940#2184653 .

This patch makes ProcessBlock call ProcessBranchOnPHI when the branch condition
is freeze(phi) as well (originally it calls the function when the condition is
phi only).

Since what ProcessBranchOnPHI does is to duplicate the basic block into
predecessors if profitable, it is still valid when the condition is freeze(phi)
too.

```
    p = phi [a, pred1] [b, pred2]
    p.fr = freeze p
    br p.fr, ...
=>
  pred1:
    p.fr = freeze a
    br p.fr, ...
  pred2:
    p.fr2 = freeze b
    br p.fr2, ...
```

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85029
2020-08-06 09:51:17 +09:00
Juneyoung Lee fd86d67b82 [JumpThreading] Add a test that duplicates insts of a basic block with cond br to preds; NFC 2020-08-06 09:19:23 +09:00
Sanjay Patel c66169136f [InstCombine] fold icmp with 'mul nsw/nuw' and constant operands
This also removes a more specific fold that only handled icmp with 0.

https://rise4fun.com/Alive/sdM9

  Name: mul nsw with icmp eq
  Pre: (C1 != 0) && (C2 % C1) == 0
  %a = mul nsw i8 %x, C1
  %r = icmp eq i8 %a, C2
    =>
  %r = icmp eq i8 %x, C2 / C1

  Name: mul nuw with icmp eq
  Pre: (C1 != 0) && (C2 %u C1) == 0
  %a = mul nuw i8 %x, C1
  %r = icmp eq i8 %a, C2
    =>
  %r = icmp eq i8 %x, C2 /u C1

  Name: mul nsw with icmp ne
  Pre: (C1 != 0) && (C2 % C1) == 0
  %a = mul nsw i8 %x, C1
  %r = icmp ne i8 %a, C2
    =>
  %r = icmp ne i8 %x, C2 / C1

  Name: mul nuw with icmp ne
  Pre: (C1 != 0) && (C2 %u C1) == 0
  %a = mul nuw i8 %x, C1
  %r = icmp ne i8 %a, C2
    =>
  %r = icmp ne i8 %x, C2 /u C1
2020-08-05 17:29:32 -04:00
Sanjay Patel 0315571a19 [InstCombine] add tests for icmp with mul nsw/nuw; NFC 2020-08-05 17:07:22 -04:00
Arthur Eubanks 9e6a1e5781 [NewPM][LoopRotate] Rename rotate -> loop-rotate
To match legacy pass name.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D85338
2020-08-05 12:25:01 -07:00
Roman Lebedev f3056dcc02
[InstCombine] Negator: -(cond ? x : -x) --> cond ? -x : x
We were errneously only doing that for old-style abs/nabs,
but we have no such legality check on the condition of the select.

https://rise4fun.com/Alive/xBHS
2020-08-05 21:47:30 +03:00
Roman Lebedev 1d25d0734a
[NFC][InstCombine] Add tests for negation of old-style [n]abs, select-of-op-vs-negation-of-op 2020-08-05 21:47:30 +03:00
Sanjay Patel e8760bb9a8 [InstSimplify] fold icmp with mul nsw and constant operands
https://rise4fun.com/Alive/slvl

  Name: mul nsw with icmp eq
  Pre: (C2 % C1) != 0
  %a = mul nsw i8 %x, C1
  %r = icmp eq i8 %a, C2
    =>
  %r = false

  Name: mul nsw with icmp ne
  Pre: (C2 % C1) != 0
  %a = mul nsw i8 %x, C1
  %r = icmp ne i8 %a, C2
    =>
  %r = true

Follow-up to the 'nuw' variation added with:
rGf879c9b79621
2020-08-05 14:38:39 -04:00
Sanjay Patel f879c9b796 [InstSimplify] fold icmp with mul nuw and constant operands
https://rise4fun.com/Alive/pZEr

  Name: mul nuw with icmp eq
  Pre: (C2 %u C1) != 0
  %a = mul nuw i8 %x, C1
  %r = icmp eq i8 %a, C2
    =>
  %r = false

  Name: mul nuw with icmp ne
  Pre: (C2 %u C1) != 0
  %a = mul nuw i8 %x, C1
  %r = icmp ne i8 %a, C2
    =>
  %r = true

There are potentially several other transforms we need to add based on:
D51625
...but it doesn't look like there was follow-up to that patch.
2020-08-05 14:32:17 -04:00
Sanjay Patel a569a0af0d [InstSimplify] add vector tests for icmp with mul nuw; NFC
Also, the naming was off on a couple of tests.
2020-08-05 14:32:17 -04:00
Jordan Rupprecht 3c39db0c44 Revert "[LoopVectorizer] Inloop vector reductions"
This reverts commit e9761688e4. It breaks the build:

```
~/src/llvm-project/llvm/lib/Analysis/IVDescriptors.cpp:868:10: error: no viable conversion from returned value of type 'SmallVector<[...], 8>' to function return type 'SmallVector<[...], 4>'
  return ReductionOperations;
```
2020-08-05 10:24:15 -07:00
David Green e9761688e4 [LoopVectorizer] Inloop vector reductions
Arm MVE has multiple instructions such as VMLAVA.s8, which (in this
case) can take two 128bit vectors, sign extend the inputs to i32,
multiplying them together and sum the result into a 32bit general
purpose register. So taking 16 i8's as inputs, they can multiply and
accumulate the result into a single i32 without any rounding/truncating
along the way. There are also reduction instructions for plain integer
add and min/max, and operations that sum into a pair of 32bit registers
together treated as a 64bit integer (even though MVE does not have a
plain 64bit addition instruction). So giving the vectorizer the ability
to use these instructions both enables us to vectorize at higher
bitwidths, and to vectorize things we previously could not.

In order to do that we need a way to represent that the reduction
operation, specified with a llvm.experimental.vector.reduce when
vectorizing for Arm, occurs inside the loop not after it like most
reductions. This patch attempts to do that, teaching the vectorizer
about in-loop reductions. It does this through a vplan recipe
representing the reductions that the original chain of reduction
operations is replaced by. Cost modelling is currently just done through
a prefersInloopReduction TTI hook (which follows in a later patch).

Differential Revision: https://reviews.llvm.org/D75069
2020-08-05 18:14:05 +01:00
Roman Lebedev 3a3c9519e2
[InstCombine] Negator: 0 - (X + Y) --> (-X) - Y iff a single operand negated
This was the most obvious regression in
f5df5cd5586ae9cfb2d9e53704dfc76f47aff149.f5df5cd5586ae9cfb2d9e53704dfc76f47aff149

We really don't want to do this if the original/outermost subtraction
isn't a negation, and therefore doesn't go away - just sinking negation
isn't a win. We are actually appear to be missing folds so hoist it.

https://rise4fun.com/Alive/tiVe
2020-08-05 20:01:13 +03:00
Roman Lebedev 26f79e258f
[NFC][InstCombine] Tests for negation of `add` w/ single negatible operand 2020-08-05 20:01:13 +03:00
Sanjay Patel 719954eacb [InstSimplify] add tests for icmp with 'mul nuw' operand; NFC 2020-08-05 12:46:45 -04:00
Roman Lebedev f5df5cd558
Recommit "[InstCombine] Negator: -(X << C) --> X * (-1 << C)"
This reverts commit ac70b37a00
which reverted commit 8aeb2fe13a
because codegen tests got broken and i needed time to investigate.

This shows some regressions in tests, but they are all around GEP's,
so i'm not really sure how important those are.

https://rise4fun.com/Alive/1Gn
2020-08-05 15:59:13 +03:00
Yevgeny Rouban bc10888dcd DomTree: Make PostDomTree indifferent to block successors swap
Fixed the commit c35585e209.

This is a fix for the bug 46098 where PostDominatorTree
is unexpectedly changed by InstCombine's branch swapping
transformation.
This patch fixes PostDomTree builder. While looking for
the furthest away node in a reverse unreachable subgraph
this patch runs DFS with successors in their function order.
This order is indifferent to the order of successors, so is
the furthest away node.

Reviewers: kuhar, nikic, lebedev.ri
Differential Revision: https://reviews.llvm.org/D84763
2020-08-05 14:26:32 +07:00
Juneyoung Lee e0d99e9aaf [JumpThreading] Consider freeze as a zero-cost instruction
This is a simple patch that makes freeze as a zero-cost instruction, as bitcast already is.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85023
2020-08-05 14:42:36 +09:00
Juneyoung Lee 3401f9706b [JumpThreading] Add a test for D85023; NFC 2020-08-05 13:40:14 +09:00
Mehdi Amini 1366d66a22 Revert "DomTree: Make PostDomTree immune to block successors swap"
This reverts commit c35585e209.

The MLIR is broken with this patch, reproduce by adding
-DLLVM_ENABLE_PROJECTS=mlir to the cmake configuration and
build `ninja tools/mlir/lib/IR/CMakeFiles/obj.MLIRIR.dir/Dominance.cpp.o`
2020-08-05 04:32:44 +00:00
Yevgeny Rouban c35585e209 DomTree: Make PostDomTree immune to block successors swap
This is another fix for the bug 46098 where PostDominatorTree
is unexpectedly changed by InstCombine's branch swapping
transformation.
This patch fixes PostDomTree builder. While looking for
the furthest away node in a reverse unreachable subgraph
this patch runs DFS with successors in their function order.
This order is indifferent to the order of successors, so is
the furthest away node.

Reviewers: kuhar, nikic, lebedev.ri
Differential Revision: https://reviews.llvm.org/D84763
2020-08-05 11:06:54 +07:00
Roman Lebedev ac70b37a00
Revert "[InstCombine] Negator: -(X << C) --> X * (-1 << C)"
Breaks codegen tests, will recommit later.

This reverts commit 8aeb2fe13a.
2020-08-05 03:19:38 +03:00
Roman Lebedev 8aeb2fe13a
[InstCombine] Negator: -(X << C) --> X * (-1 << C)
This shows some regressions in tests, but they are all around GEP's,
so i'm not really sure how important those are.

https://rise4fun.com/Alive/1Gn
2020-08-05 03:13:14 +03:00
Roman Lebedev 8fd57b06a4
[NFC][InstCombine] Fix value names (s/%tmp/%i/) and autogenerate a few tests being affected by negator change 2020-08-05 03:12:14 +03:00
Roman Lebedev 3f3303324e
[NFC][InstCombine] Negator: add tests for negation of left-shift by constant 2020-08-05 03:12:14 +03:00
Adrian Prantl bf82ff61a6 Teach SROA to handle allocas with more than one dbg.declare.
It is technically legal for optimizations to create an alloca that is
used by more than one dbg.declare, if one or both of them are inlined
instances of aliasing variables.

Differential Revision: https://reviews.llvm.org/D85172
2020-08-04 15:54:51 -07:00
AK f0f68c6e6c [HotColdSplit] Add test case for unlikely attribute in outlined function
Differential Revision: https://reviews.llvm.org/D85232
2020-08-04 13:16:33 -07:00
Xavier Denis 29fe3fe615 [InstSimplify] Peephole optimization for icmp (urem X, Y), X
This revision adds the following peephole optimization
and it's negation:

    %a = urem i64 %x, %y
    %b = icmp ule i64 %a, %x
    ====>
    %b = true

With John Regehr's help this optimization was checked with Alive2
which suggests it should be valid.

This pattern occurs in the bound checks of Rust code, the program

    const N: usize = 3;
    const T = u8;

    pub fn split_mutiple(slice: &[T]) -> (&[T], &[T]) {
        let len = slice.len() / N;
        slice.split_at(len * N)
    }

the method call slice.split_at will check that len * N is within
the bounds of slice, this bounds check is after some transformations
turned into the urem seen above and then LLVM fails to optimize it
any further. Adding this optimization would cause this bounds check
to be fully optimized away.

ref: https://github.com/rust-lang/rust/issues/74938

Differential Revision: https://reviews.llvm.org/D85092
2020-08-04 20:48:37 +02:00
Xavier Denis b778b04b69 [InstSimplify] Add tests for icmp with urem divisor (NFC) 2020-08-04 20:45:20 +02:00
Nikita Popov 4564974504 [SCCP] Propagate inequalities
Teach SCCP to create notconstant lattice values from inequality
assumes and nonnull metadata, and update getConstant() to make
use of them. Additionally isOverdefined() needs to be changed to
consider notconstant an overdefined value.

Handling inequality branches is delayed until our branch on undef
story in other passes has been improved.

Differential Revision: https://reviews.llvm.org/D83643
2020-08-04 20:20:52 +02:00
AK f8cc94a61a Revert "[HotColdSplit] Add test case for unlikely attribute in outlined function"
This reverts commit aa1f905890.

The flag -codegenprepare maybe causing failures. Reverting this
to investigate the root cause.
2020-08-04 11:15:21 -07:00
Sanjay Patel 960cef75f4 [InstSimplify] add tests for compare of min/max; NFC
The test are adapted from the existing tests for cmp/select idioms.
2020-08-04 13:55:30 -04:00
Sanjay Patel 04e45ae1c6 [InstSimplify] fold nested min/max intrinsics with constant operands
This is based on the existing code for the non-intrinsic idioms
in InstCombine.

The vector constant constraint is non-obvious: undefs should be
ok in the outer call, but they can't propagate safely from the
inner call in all cases. Example:

https://alive2.llvm.org/ce/z/-2bVbM
  define <2 x i8> @src(<2 x i8> %x) {
  %0:
    %m = umin <2 x i8> %x, { 7, undef }
    %m2 = umin <2 x i8> { 9, 9 }, %m
    ret <2 x i8> %m2
  }
  =>
  define <2 x i8> @tgt(<2 x i8> %x) {
  %0:
    %m = umin <2 x i8> %x, { 7, undef }
    ret <2 x i8> %m
  }
  Transformation doesn't verify!
  ERROR: Value mismatch

  Example:
  <2 x i8> %x = < undef, undef >

  Source:
  <2 x i8> %m = < #x00 (0)	[based on undef value], #x00 (0) >
  <2 x i8> %m2 = < #x00 (0), #x00 (0) >

  Target:
  <2 x i8> %m = < #x07 (7), #x10 (16) >
  Source value: < #x00 (0), #x00 (0) >
  Target value: < #x07 (7), #x10 (16) >
2020-08-04 08:44:48 -04:00
Sanjay Patel 011e15bea3 [InstSimplify] add tests for min/max with constants; NFC 2020-08-04 08:02:33 -04:00
Juneyoung Lee 998c0efee0 [JumpThreading] Update test freeze.ll; NFC 2020-08-04 20:27:54 +09:00
Juneyoung Lee e734e8286b [JumpThreading] Remove cast's constraint
As discussed in D84949, this removes the constraint to cast since it does not
cause compile time degradation.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D85188
2020-08-04 19:09:25 +09:00
Juneyoung Lee e218da7ff3 [JumpThreading] Add a test for simplification of cast of any op; NFC 2020-08-04 19:08:58 +09:00
David Green 3c7e7d40a9 [BasicAA] Enable -basic-aa-recphi by default
This option was added a while back, to help improve AA around pointer
phi loops. It looks for phi(gep(phi, const), x) loops, checking if x can
then prove more precise aliasing info.

Differential Revision: https://reviews.llvm.org/D82998
2020-08-04 10:43:42 +01:00
Juneyoung Lee 1ea8465337 [JumpThreading] Merge/rename thread-two-bbsN.ll tests; NFC 2020-08-04 17:07:28 +09:00
Juneyoung Lee 6f97103b56 [JumpThreading] Don't limit the type of an operand
Compared to the optimized code with branch conditions never frozen,
limiting the type of freeze's operand causes generation of suboptimal code in
some cases.
I would like to suggest removing the constraint, as this patch does.
If the number of freeze instructions becomes significant, this can be revisited.

Differential Revision: https://reviews.llvm.org/D84949
2020-08-04 16:21:58 +09:00
Fangrui Song e56626e438 [PGO] Move __profc_ and __profvp_ from their own comdat groups to __profd_'s comdat group
D68041 placed `__profc_`,  `__profd_` and (if exists) `__profvp_` in different comdat groups.
There are some issues:

* Cost: one or two additional section headers (`.group` section(s)): 64 or 128 bytes on ELF64.
* `__profc_`,  `__profd_` and (if exists) `__profvp_` should be retained or
  discarded. Placing them into separate comdat groups is conceptually inferior.
* If the prevailing group does not include `__profvp_` (value profiling not
  used) but a non-prevailing group from another translation unit has `__profvp_`
  (the function is inlined into another and triggers value profiling), there
  will be a stray `__profvp_` if --gc-sections is not enabled.
  This has been fixed by 3d6f53018f.

Actually, we can reuse an existing symbol (we choose `__profd_`) as the group
signature to avoid a string in the string table (the sole reason that D68041
could improve code size is that `__profv_` was an otherwise unused symbol which
wasted string table space). This saves one or two section headers.

For a -DCMAKE_BUILD_TYPE=Release -DLLVM_BUILD_INSTRUMENTED=IR build, `ninja
clang lld`, the patch has saved 10.5MiB (2.2%) for the total .o size.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D84723
2020-08-03 20:35:50 -07:00
Sanjay Patel 9e5cf6bde5 [InstSimplify] fold variations of max-of-min with common operand
https://alive2.llvm.org/ce/z/ZtxpZ3
2020-08-03 15:02:46 -04:00