4821 Commits

Author SHA1 Message Date
Hirochika Matsumoto a3cffc1150 [InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2
https://alive2.llvm.org/ce/z/94yRMN

Fixes #54177

Differential Revision: https://reviews.llvm.org/D122077
2022-03-29 11:30:06 -04:00
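Note on a3cffc1150: a minimal sketch (not LLVM code) that checks the fold from the title exhaustively at one bit width. The helper names `ctpop16` and `ctpop_fold_holds` are hypothetical, and 16 bits is an arbitrary width chosen to keep the loop small; the Alive2 link above is the actual proof.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Population count of a 16-bit value (stand-in for llvm.ctpop.i16).
static unsigned ctpop16(std::uint16_t x) {
  return static_cast<unsigned>(std::bitset<16>(x).count());
}

// Returns true if (ctpop(x) == 1) | (x == 0) agrees with ctpop(x) < 2
// for every 16-bit input.
bool ctpop_fold_holds() {
  for (std::uint32_t v = 0; v <= 0xFFFF; ++v) {
    auto x = static_cast<std::uint16_t>(v);
    bool before = (ctpop16(x) == 1) || (x == 0);
    bool after = ctpop16(x) < 2;
    if (before != after) return false;
  }
  return true;
}
```

Calling `assert(ctpop_fold_holds());` from a test driver exercises all 65536 inputs.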
Nikita Popov 682ef39b1a [InstCombine] Remove call to getPointerElementType()
This was erroneously re-introduced as part of
bb0b23174e.
2022-03-29 16:52:29 +02:00
Johannes Doerfert bb0b23174e [InstCombineCalls] Optimize call of bitcast even w/ parameter attributes
Before we gave up if a call through bitcast had parameter attributes.
Interestingly, we allowed attributes for the return value already. We
now handle both the same way, namely, we drop the ones that are
incompatible with the new type and keep the rest. This cannot cause
"more UB" than initially present.

Differential Revision: https://reviews.llvm.org/D119967
2022-03-28 20:57:52 -05:00
chenglin.bi 9a53793ab8 [InstCombine] Fold two select patterns into and-or
select (~a | c), a, b -> and a, (or c, b) https://alive2.llvm.org/ce/z/bnDobs
select (~c & b), a, b -> and b, (or a, c) https://alive2.llvm.org/ce/z/k2jJHJ

Differential Revision: https://reviews.llvm.org/D122152
2022-03-28 16:07:55 -04:00
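Note on 9a53793ab8: a small truth-table sketch of the two select patterns, treating `a`, `b`, and `c` as plain booleans. Poison and undef are deliberately not modeled here (the Alive2 links cover those), and `select_i1` and `select_and_or_folds_hold` are hypothetical names.

```cpp
#include <cassert>

// select(cond, t, f) on plain booleans; poison and undef are not modeled.
static bool select_i1(bool cond, bool t, bool f) { return cond ? t : f; }

// Returns true if both rewrites agree with the original select for all
// eight combinations of a, b, c.
bool select_and_or_folds_hold() {
  for (int bits = 0; bits < 8; ++bits) {
    bool a = (bits & 1) != 0;
    bool b = (bits & 2) != 0;
    bool c = (bits & 4) != 0;
    // select (~a | c), a, b  ->  and a, (or c, b)
    if (select_i1(!a || c, a, b) != (a && (c || b))) return false;
    // select (~c & b), a, b  ->  and b, (or a, c)
    if (select_i1(!c && b, a, b) != (b && (a || c))) return false;
  }
  return true;
}
```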
Simon Pilgrim 6a094a6264 [InstCombine] SimplifyDemandedUseBits - remove ashr node if we only demand known sign bits
We already do this for SelectionDAG, but we're missing it here.

Noticed while re-triaging PR21929

Differential Revision: https://reviews.llvm.org/D122340
2022-03-25 15:39:08 +00:00
Sanjay Patel 5dbb53b1b4 [InstCombine] merge shuffled vector negate and multiply
Add the "(0 - X) --> (X * -1)" reverse identity to the list of alternate form binops.

We need a little hack to make the existing logic work because it does not expect to
move constants from op0 to op1, but the code comment hopefully makes that clear.
I don't think there are any other identities like that.

Fixes #54364

Differential Revision: https://reviews.llvm.org/D122390
2022-03-24 10:25:16 -04:00
Dávid Bolvanský 4397504c2d [NFCI] Fix set-but-unused warning in InstCombineAddSub.cpp 2022-03-24 08:33:40 +01:00
chenglin.bi 52f323d0f1 [InstCombine] Fold abs of known negative operand when source is sub
When abs source comes from (x - y), check if a "x > y" dominating
condition exists.

Fixes #54132

Differential Revision: https://reviews.llvm.org/D122013
2022-03-23 15:21:33 -04:00
Sanjay Patel 0fcff69bcb [InstCombine] try to narrow shifted bswap-of-zext (2nd try)
The first attempt at this missed a validity check.
This version includes a test of the narrow source
type for modulo-16-bits.

Original commit message:

This is the IR counterpart to 370ebc9d9a
which provided a bswap narrowing fix for issue #53867.

Here we can be more general (although I'm not sure yet
what would happen for illegal types in codegen - too
rare to worry about?):
https://alive2.llvm.org/ce/z/3-CPfo

This will be more effective if we have moved the shift
after the bswap as proposed in D122010, but it is
independent of that patch.

Differential Revision: https://reviews.llvm.org/D122166
2022-03-23 11:28:37 -04:00
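Note on 0fcff69bcb: the commit message does not spell out the pattern, so the following is only one plausible instance of "narrowing a shifted bswap-of-zext", assumed here to be `lshr (bswap (zext i16 X to i32)), 16 --> zext (bswap i16 X)`. The helpers `bswap16`, `bswap32`, and `bswap_narrowing_holds` are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Byte swap of a 16-bit value (stand-in for llvm.bswap.i16).
static std::uint16_t bswap16(std::uint16_t x) {
  return static_cast<std::uint16_t>((x >> 8) | (x << 8));
}

// Byte swap of a 32-bit value (stand-in for llvm.bswap.i32).
static std::uint32_t bswap32(std::uint32_t x) {
  return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
         ((x << 8) & 0x00FF0000u) | (x << 24);
}

// Assumed instance: lshr (bswap (zext i16 X to i32)), 16 == zext (bswap X).
bool bswap_narrowing_holds() {
  for (std::uint32_t v = 0; v <= 0xFFFF; ++v) {
    auto x = static_cast<std::uint16_t>(v);
    std::uint32_t wide = bswap32(static_cast<std::uint32_t>(x)) >> 16;
    std::uint32_t narrow = static_cast<std::uint32_t>(bswap16(x));
    if (wide != narrow) return false;
  }
  return true;
}
```

The source type in this sketch is 16 bits wide, which satisfies the multiple-of-16-bits condition that the second attempt adds.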
Nathan Chancellor 4e0008dcbe Revert "[InstCombine] try to narrow shifted bswap-of-zext"
This reverts commit 9e9bda2e8f.

This causes a backend error when building the Linux kernel for arm64.
See https://reviews.llvm.org/D122166 for a simplified reproducer.
2022-03-22 17:32:33 -07:00
Philip Reames 7abefc4222 [instcombine] Fold away memset/memmove from otherwise unused alloca
The motivation for this is that while both memcpyopt and dse will catch this case, both are limited by MSSA's walk back threshold when finding clobbers.  As such, if you have a memcpy of an otherwise dead alloca placed towards the end of a long basic block with lots of other memory instructions, it would be missed.  This is a bit undesirable for such an "obviously" useless bit of code.

As noted in comments, we should probably generalize instcombine's escape analysis peephole (see visitAllocInst) to allow read xor write.  Doing that would subsume this code in a more general way, but is also a more involved change.  For the moment, I went with the easiest fix.
2022-03-22 13:48:48 -07:00
Sanjay Patel ccf8c969c2 [InstCombine] reorder code, fix formatting; NFC
The affected code can be updated to solve #54364,
so make some cosmetic diffs before real changes.
2022-03-22 16:33:01 -04:00
Sanjay Patel 60820e53ec [InstCombine] try to canonicalize logical shift after bswap
When shifting by a byte-multiple:
bswap (shl X, C) --> lshr (bswap X), C
bswap (lshr X, C) --> shl (bswap X), C

This is an IR implementation of a transform suggested in D120648.
The "swaps cancel" test models the motivating optimization from
that proposal.

Alive2 checks (as noted in the other review, we could use
knownbits to handle shift-by-variable-amount, but that can be an
enhancement patch):
https://alive2.llvm.org/ce/z/pXUaRf
https://alive2.llvm.org/ce/z/ZnaMLf

Differential Revision: https://reviews.llvm.org/D122010
2022-03-22 09:10:55 -04:00
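Note on 60820e53ec: a minimal sketch of the two byte-multiple shift identities listed above, checked on a handful of 32-bit samples rather than proven. `bswap32_ref` and `bswap_shift_folds_hold` are hypothetical names, and the sample values are arbitrary.

```cpp
#include <cassert>
#include <cstdint>

// Byte swap of a 32-bit value (stand-in for llvm.bswap.i32).
static std::uint32_t bswap32_ref(std::uint32_t x) {
  return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
         ((x << 8) & 0x00FF0000u) | (x << 24);
}

// Checks, for shift amounts that are multiples of 8:
//   bswap (shl X, C)  == lshr (bswap X), C
//   bswap (lshr X, C) == shl (bswap X), C
bool bswap_shift_folds_hold() {
  const std::uint32_t samples[] = {0u,          1u,          0x12345678u,
                                   0x80000001u, 0xDEADBEEFu, 0xFFFFFFFFu};
  const unsigned amounts[] = {0, 8, 16, 24};
  for (std::uint32_t x : samples) {
    for (unsigned c : amounts) {
      if (bswap32_ref(x << c) != (bswap32_ref(x) >> c)) return false;
      if (bswap32_ref(x >> c) != (bswap32_ref(x) << c)) return false;
    }
  }
  return true;
}
```

A shift that is not a multiple of 8 splits bytes, so the identities are not expected to hold there; that is why only byte-multiple amounts are tested.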
Sanjay Patel 9e9bda2e8f [InstCombine] try to narrow shifted bswap-of-zext
This is the IR counterpart to 370ebc9d9a
which provided a bswap narrowing fix for issue #53867.

Here we can be more general (although I'm not sure yet
what would happen for illegal types in codegen - too
rare to worry about?):
https://alive2.llvm.org/ce/z/3-CPfo

This will be more effective if we have moved the shift
after the bswap as proposed in D122010, but it is
independent of that patch.

Differential Revision: https://reviews.llvm.org/D122166
2022-03-22 08:22:30 -04:00
Nikita Popov fc8946fae7 [InstCombine] Remove integer SPF of SPF folds (NFCI)
Now that we canonicalize to intrinsics, these folds should no
longer be needed. Only one fold that also applies to floating-point
min/max is retained.
2022-03-18 10:20:48 +01:00
Andrew Wei 0af3e6a22d [InstCombine] Sink instructions with multiple users in a successor block.
This patch tries to sink instructions when they are only used in a successor block.

This is a further enhancement patch based on Anna's commit:
D109700, which allows sinking an instruction having multiple uses in a single user.

This patch adds support for sinking instructions with multiple users in a single successor block.
It could fix a known issue from rust:
  https://github.com/rust-lang/rust/issues/51346#issuecomment-394443610

Reviewed By: nikic, reames

Differential Revision: https://reviews.llvm.org/D121585
2022-03-18 11:53:45 +08:00
Nikita Popov 4010a7a5d0 Reapply [InstCombine] Support switch in phi to cond fold
Reapply with an explicit check for multi-edges, as the expected
behavior of multi-edge dominance is unclear (D120811).

-----

For conditional branches, we know the value is i1 0 or i1 1 along
the outgoing edges. For switches we can apply exactly the same
optimization, just with the known values determined by the switch
cases.
2022-03-17 10:03:09 +01:00
Sanjay Patel 598721f866 [InstCombine] try harder to propagate 'nsz' through fneg-of-select
This can be viewed as swapping the select arms:
https://alive2.llvm.org/ce/z/jUvFMJ
...so we don't have the 'nsz' problem with the more general fold.

This unlocks other folds for the motivating fabs example.
This was discussed in issue #38828.
2022-03-15 11:05:29 -04:00
Simon Pilgrim 7e4cf582cf [InstCombine] Add general constant support to eq/ne icmp(add(X,C1),add(Y,C2)) -> icmp(add(X,C1-C2),Y) fold
A further extension for Issue #32161

For eq/ne comparisons - the sign mismatch and bounds constraints are redundant, so if that fold fails, fall back and just fold the constants directly.

https://alive2.llvm.org/ce/z/cdodNQ

The loop rotation test change looks mostly benign - the backend doesn't seem to suffer? https://gcc.godbolt.org/z/dErMY78To

Differential Revision: https://reviews.llvm.org/D121551
2022-03-15 14:17:38 +00:00
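Note on 7e4cf582cf: a sketch of why the eq/ne form needs no sign or bounds constraints, using wrapping 8-bit arithmetic. The constant pairs are arbitrary samples and `icmp_add_fold_holds` is a hypothetical name; the Alive2 link above is the real proof.

```cpp
#include <cassert>
#include <cstdint>

// Checks icmp eq (add X, C1), (add Y, C2) == icmp eq (add X, C1 - C2), Y
// for all 8-bit X and Y and a few sample constant pairs. All arithmetic
// wraps modulo 256, mirroring IR add/sub without nsw/nuw flags.
bool icmp_add_fold_holds() {
  const std::uint8_t consts[][2] = {
      {0, 0}, {1, 255}, {200, 100}, {128, 128}, {7, 250}};
  for (const auto &p : consts) {
    for (unsigned xi = 0; xi < 256; ++xi) {
      for (unsigned yi = 0; yi < 256; ++yi) {
        auto x = static_cast<std::uint8_t>(xi);
        auto y = static_cast<std::uint8_t>(yi);
        auto lhs = static_cast<std::uint8_t>(x + p[0]);
        auto rhs = static_cast<std::uint8_t>(y + p[1]);
        auto diff = static_cast<std::uint8_t>(p[0] - p[1]);
        auto folded = static_cast<std::uint8_t>(x + diff);
        if ((lhs == rhs) != (folded == y)) return false;
      }
    }
  }
  return true;
}
```

Because equality is preserved when the same value is subtracted from both sides modulo 2^n, the ne case follows by negating both comparisons.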
Craig Topper ce78e68261 [InstCombine] Fold select based logic of fcmps with same operands when FMF is present.
If we have a logical and/or in select form and the true/false operand
is an fcmp with poison generating FMF, we won't be able to fold it
to an and/or instruction. This prevents us from optimizing the case
where it is a logical operation of two fcmps with identical operands.

This patch adds explicit checks for this case that don't rely on
converting to and/or to do the optimization. It reuses the existing
foldLogicOfFCmps, but adds a new flag to disable the other combine
that is inside that function.

FMF flags from the two FCmps are intersected using the logic added in
D121243. The FIXME has been updated to indicate that we can only use
a union for the non-select form.

This allows us to optimize cases like this from compare-fp-3.c in the
gcc torture suite with fast math.

void
test1 (float x, float y)
{
  if ((x==y) && (x!=y))
    link_error0();
}

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D121323
2022-03-14 14:45:07 -07:00
Sanjay Patel 3491f2f4b0 [InstCombine] replace negated operand in fcmp with 0.0
X (any pred) -X --> X (any pred) 0.0

This works with all FP values and preserves FMF.
Alive2 examples:
https://alive2.llvm.org/ce/z/dj6jhp

This can also create one of the patterns that we match as "fabs"
as shown in one of the test diffs.
2022-03-10 12:53:32 -05:00
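Note on 3491f2f4b0: a spot check of `X (pred) -X --> X (pred) 0.0` on a few special values. This sketch covers only the comparisons that C++ operators express directly (the ordered predicates plus `!=`), assumes the compiler is not run in a fast-math mode, and uses the hypothetical name `fcmp_negated_operand_fold_holds`.

```cpp
#include <cassert>
#include <limits>

// Compares X against -X and against 0.0 for several predicates and a few
// interesting inputs: signed zeros, infinities, NaN, and ordinary values.
bool fcmp_negated_operand_fold_holds() {
  const double inf = std::numeric_limits<double>::infinity();
  const double nan = std::numeric_limits<double>::quiet_NaN();
  const double samples[] = {0.0, -0.0, 1.5, -1.5, inf, -inf, nan};
  for (double x : samples) {
    const double nx = -x;
    if ((x < nx) != (x < 0.0)) return false;
    if ((x <= nx) != (x <= 0.0)) return false;
    if ((x > nx) != (x > 0.0)) return false;
    if ((x >= nx) != (x >= 0.0)) return false;
    if ((x == nx) != (x == 0.0)) return false;
    if ((x != nx) != (x != 0.0)) return false;
  }
  return true;
}
```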
Sanjay Patel 9fac110bf7 Revert "[InstCombine] fold fcmp with lossy casted constant"
This reverts commit 9397bdc67e.

This optimization is likely to surprise programmers as seen
in post-commit comments, so we should add a clang warning
first (that is proposed in D121306).
2022-03-10 10:22:22 -05:00
Simon Pilgrim 808d9d260b [InstCombine] Add vector support to icmp(add(X,C1),add(Y,C2)) -> icmp(add(X,C1-C2),Y) fold
As discussed on Issue #32161 this fold can be generalized a lot more than it currently is, but this patch at least adds vector support.

Differential Revision: https://reviews.llvm.org/D121358
2022-03-10 13:30:48 +00:00
Craig Topper f72fe2ef67 [InstCombine] Preserve FMF in foldLogicOfFCmps.
This patch intersects the fast math flags from the two fcmps instead
of dropping them.

I poked at this a bunch with Alive2 for nnan and ninf flags and it seemed
to check out. With the other flags it told me "Couldn't prove the
correctness of the transformation". Not sure if I should just preserve
nnan and ninf?

Reviewed By: spatel, lebedev.ri

Differential Revision: https://reviews.llvm.org/D121243
2022-03-09 09:17:09 -08:00
Sanjay Patel 9397bdc67e [InstCombine] fold fcmp with lossy casted constant
This is noted as a missing clang warning in #54222
(and we should still make that enhancement).

Alive2 proofs:
https://alive2.llvm.org/ce/z/Q8drDq
https://alive2.llvm.org/ce/z/pE6LRt

I don't see a single conversion for all predicates
using "getFCmpCode" logic, so other predicates are
left as a TODO item.
2022-03-08 12:41:12 -05:00
Arnold Schwaighofer dcdc1f29bb InstCombine: Can't fold a phi arg load into the phi if the load is from a swifterror address
`swifterror` addresses are only allowed as operands to load, store, and
calls.

The following transformation is not allowed. It would create a phi with a
`swifterror` address operand.

```
 %addr = alloca swifterror i8*
 br %cond, label %bb1, label %bb2

 bb1:
   %val1 = load i8*, i8** %addr
   br exit

 bb2:
   %val2 = load i8*, i8** %addr
   br exit

 exit:
   %val = phi [%val1, %bb1] [%val2, %bb2]
```

=>

```
 %addr = alloca swifterror i8*
 br %cond, label %bb1, label %bb2

 bb1:
   br exit

 bb2:
   br exit

 exit:
   %val_addr = phi [%addr, %bb1] [%addr, %bb2]
   %val2 = load i8*, i8** %val_addr
```

rdar://89865485

Differential Revision: https://reviews.llvm.org/D121217
2022-03-08 09:09:51 -08:00
Augie Fackler 5e4c75db3b InstructionCombining: avoid eliding mismatched alloc/free pairs
Prior to this change LLVM would happily elide a call to any allocation
function and a call to any free function operating on the same unused
pointer. This can cause problems in some obscure cases, for example if
the body of operator new can be inlined but the body of
operator delete can't, as in this example from jyknight:

    #include <stdlib.h>
    #include <stdio.h>

    int allocs = 0;

    void *operator new(size_t n) {
        allocs++;
        void *mem = malloc(n);
        if (!mem) abort();
        return mem;
    }

    __attribute__((noinline)) void operator delete(void *mem) noexcept {
        allocs--;
        free(mem);
    }

    void deleteit(int*i) { delete i; }
    int main() {
        int*i = new int;
        deleteit(i);
        if (allocs != 0)
          printf("MEMORY LEAK! allocs: %d\n", allocs);
    }

This patch addresses the issue by introducing the concept of an
allocator function family and uses it to make sure that alloc/free
function pairs are only removed if they're in the same family.

Differential Revision: https://reviews.llvm.org/D117356
2022-03-04 10:41:10 -05:00
Craig Topper 608161225e [InstCombine][Analysis] Move getFCmpCode and getPredForFCmpCode to CmpInstAnalysis. NFC
The similar getICmpCode and getPredForICmpCode are already there.
This moves FP for consistency.

I think InstCombine is currently the only user of both.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D120754
2022-03-03 09:33:24 -08:00
Nikita Popov c1b9667148 [InstCombine] Support opaque pointers in callee bitcast fold
To make this actually trigger, we also need to check whether the
function types differ, which is a hidden cast under opaque pointers.
The transform is somewhat less relevant there because it is
primarily about pointer bitcasts, but it can also happen with other
bit- or pointer-castable types.

Byval handling is easier with opaque pointers because there is no
need to adjust the byval type, we only need to make sure that it's
still a pointer.
2022-03-03 11:07:39 +01:00
Nikita Popov 6c8adc5054 [InstCombine] Remove unnecessary byval check in callee cast fold
The logic for handling this was fixed in
8d7f118ab2, but the check for byval
on the callee was retained. This resulted in a weird situation
where the transform would work depending on whether the byval
was only on the call or on both the call and the function.
2022-03-03 10:55:14 +01:00
serge-sans-paille 59630917d6 Cleanup includes: Transform/Scalar
Estimated impact on preprocessor output line:
before: 1062981579
after:  1062494547

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120817
2022-03-03 07:56:34 +01:00
Nikita Popov 61580d0949 Reapply [InstCombine] Remove one-use limitation from X-Y==0 fold
This is a recommit without changes. I originally reverted this
due to a significant code-size regression on tramp3d-v4, however
further investigation showed that in the tramp3d-v4 case this
change enables additional optimizations (in particular more
jump threading), which happens to reduce the size of a function
just enough to be eligible for inlining at hot callsites, which
results in the code size increase. As such, this was just bad
luck.

-----

This one-use limitation is artificial, we do not increase
instruction count if we perform the fold with multiple uses. The
motivating case is shown in @sub_eq_zero_select, where the one-use
limitation causes us to miss a subsequent select fold.

I believe the backend is pretty good about reusing flag-producing
subs for cmps with same operands, so I think doing this is fine.

Differential Revision: https://reviews.llvm.org/D120337
2022-03-02 16:43:33 +01:00
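Note on 61580d0949: the fold named in the title is assumed here to rewrite `X - Y == 0` as `X == Y`. A minimal exhaustive check at 8 bits with wrapping subtraction, under the hypothetical name `sub_eq_zero_fold_holds`:

```cpp
#include <cassert>
#include <cstdint>

// Checks (X - Y == 0) == (X == Y) for all 8-bit X and Y, with the
// subtraction wrapping modulo 256 like an IR sub without nsw/nuw.
bool sub_eq_zero_fold_holds() {
  for (unsigned xi = 0; xi < 256; ++xi) {
    for (unsigned yi = 0; yi < 256; ++yi) {
      auto x = static_cast<std::uint8_t>(xi);
      auto y = static_cast<std::uint8_t>(yi);
      auto diff = static_cast<std::uint8_t>(x - y);
      if ((diff == 0) != (x == y)) return false;
    }
  }
  return true;
}
```

The rewritten compare does not use the sub at all, which is consistent with the message's point that performing the fold with multiple uses adds no instructions.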
Nikita Popov 5cf06d10f8 Revert "[InstCombine] Support switch in phi to cond fold"
This reverts commit 0817ce86b5.

Seeing some ppc64le stage2 failures, reverting to investigate.
2022-03-02 12:49:47 +01:00
Nikita Popov 0817ce86b5 [InstCombine] Support switch in phi to cond fold
For conditional branches, we know the value is i1 0 or i1 1 along
the outgoing edges. For switches we can apply exactly the same
optimization, just with the known values determined by the switch
cases.
2022-03-02 12:16:32 +01:00
serge-sans-paille a494ae43be Cleanup includes: TransformsUtils
Estimation on the impact on preprocessor output:
before: 1065307662
after:  1064800684

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120741
2022-03-01 21:00:07 +01:00
Craig Topper 7bc6667845 [Analysis] Simplify the interface to llvm::getICmpCode. NFC
Instead of passing an ICmpInst * and a bool just pass the predicate
from the caller.

I'm considering moving the similar FCmp functions from InstCombine
over here and this makes the interface consistent with what is used
for FCmp.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D120609
2022-03-01 09:53:27 -08:00
Nikita Popov a1f442b278 [InstCombine] Support phi to cond fold with more than two preds
This transform can still be applied if there are more than two
phi inputs, as long as phi inputs with the same value are dominated
by the same idom edge.
2022-03-01 16:31:49 +01:00
Nikita Popov 26748bb15a [InstCombine] Slightly relax one-use check in abs canonicalization
Treat the icmp and sub symmetrically, and require that one of them
has one use, not the icmp in particular. This could be further
relaxed in the abs (but not nabs) case to not check one-use at
all.
2022-03-01 15:06:41 +01:00
Sanjay Patel 84812b9b07 [InstCombine] drop FMF in select->copysign transform
It is not correct to propagate flags from the select
to the new instructions:
https://alive2.llvm.org/ce/z/tNATrd
https://alive2.llvm.org/ce/z/VwcVzn

Fixes #54077
2022-03-01 08:51:41 -05:00
Nikita Popov c2428a4fad [InstCombine] Remove SPF min/max check from select demanded bits (NFCI)
This should no longer be necessary now that we canonicalize to
intrinsics. This may not be entirely NFC in practice if worklist
order gets inverted and we perform demanded bits simplification
of a select user before the select is canonicalized.
2022-03-01 14:50:37 +01:00
Sanjay Patel 278b407a30 [InstCombine] fold mul-with-overflow intrinsic with -1 operand
extractvalue (any_mul_with_overflow X, -1), 0 --> -X

There are similar other potential transforms that we could do as
noted by the last TODO in the test diffs.

Fixes #54053
2022-02-28 14:13:48 -05:00
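Note on 278b407a30: a sketch of why element 0 of a multiply-with-overflow by -1 equals `-X`, modeled at 8 bits. Element 0 is taken to be the wrapped product, which has the same low bits for the signed and unsigned intrinsics; the overflow bit (element 1) is not modeled. `mul_by_minus_one_fold_holds` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Checks that the wrapped 8-bit product X * -1 equals the wrapped
// negation 0 - X for every 8-bit X. 0xFF is -1 as an 8-bit pattern.
bool mul_by_minus_one_fold_holds() {
  for (unsigned v = 0; v < 256; ++v) {
    auto x = static_cast<std::uint8_t>(v);
    auto product = static_cast<std::uint8_t>(x * 0xFFu);
    auto negated = static_cast<std::uint8_t>(0u - x);
    if (product != negated) return false;
  }
  return true;
}
```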
Sanjay Patel f422c5d871 [InstCombine] fold select-of-zero-or-ones with negated op
(X u< 2) ? -X : -1 --> sext (X != 0)
(X u> 1) ? -1 : -X --> sext (X != 0)

https://alive2.llvm.org/ce/z/U3y5Bb
https://alive2.llvm.org/ce/z/hgi-4p

This is part of solving:
2022-02-28 12:07:49 -05:00
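Note on f422c5d871: an exhaustive 8-bit sketch of the two select patterns from the message. `sext (X != 0)` is modeled as all-ones when X is nonzero and zero otherwise; `select_negated_fold_holds` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Checks, for every 8-bit X:
//   (X u< 2) ? -X : -1  ==  sext (X != 0)
//   (X u> 1) ? -1 : -X  ==  sext (X != 0)
bool select_negated_fold_holds() {
  for (unsigned v = 0; v < 256; ++v) {
    auto x = static_cast<std::uint8_t>(v);
    auto neg_x = static_cast<std::uint8_t>(0u - x);
    const std::uint8_t all_ones = 0xFF;  // -1 as an 8-bit pattern
    const std::uint8_t sext = (x != 0) ? 0xFF : 0x00;
    const std::uint8_t first = (x < 2) ? neg_x : all_ones;
    const std::uint8_t second = (x > 1) ? all_ones : neg_x;
    if (first != sext || second != sext) return false;
  }
  return true;
}
```

Only X = 0 and X = 1 reach the `-X` arm, giving 0 and -1 respectively, which is why the select collapses to a sign-extended nonzero test.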
Nikita Popov 5423b0a525 [InstCombine] Remove not of SPF min/max fold (NFCI)
This should no longer be necessary now that we canonicalize to
intrinsics. Might not be strictly NFC due to worklist order.
2022-02-28 11:02:31 +01:00
Nikita Popov d5ea3b2f33 [InstCombine] Remove sub of SPF min/max fold (NFCI)
This isn't necessary anymore, now that we canonicalize SPF min/max
to intrinsics. Might not be strictly NFC due to worklist order
changes.
2022-02-28 10:57:24 +01:00
Nikita Popov 9353ed6a53 [InstCombine] Don't call matchSAddSubSat() for SPF (NFC)
Only call it for intrinsic min/max. The moved implementation is
unchanged apart from the one-use check: It is now hardcoded to
one-use, without the two-use special case for SPF.
2022-02-28 10:41:56 +01:00
Nikita Popov 53602e4c70 [InstCombine] Remove SPF moveAddAfterMinMax() (NFC)
As SPF min/max is canonicalized to intrinsics before this point,
this change should be entirely NFC.
2022-02-28 10:28:16 +01:00
Nikita Popov ee62dcdb34 [InstCombine] Remove SPF moveNotAfterMinMax() (NFC)
This happens after SPF -> intrinsic canonicalization, and as such
should be entirely NFC.
2022-02-28 10:23:07 +01:00
Nikita Popov 0bc3e233d7 [InstCombine] Remove SPF factorizeMinMaxTree() (NFC)
SPF integer min/max is canonicalized to min/max intrinsics before
this code is reached, so this should be entirely NFC.
2022-02-28 10:22:05 +01:00
Nikita Popov e1608a9df8 [InstCombine] Remove SPF min/max canonicalization
Now that we canonicalize SPF min/max to intrinsics, there's no
need to canonicalize the structure of the SPF min/max itself
anymore. This is conceptually NFC, but in practice does slightly
impact results due to folding order differences.
2022-02-25 11:24:09 +01:00
Sanjay Patel 5379f76e63 [InstCombine] try harder to preserve 'nsz' in fneg-of-select transform
The corner case where 'nsz' needs to be removed is very narrow
as discussed here:
https://reviews.llvm.org/rG3cdd05e519dd

If the select condition is not undef, there's no problem with
propagating 'nsz':
https://alive2.llvm.org/ce/z/4GWJdq
2022-02-24 10:43:53 -05:00