Summary:
Make foldVectorBinop return null if the instruction type is a scalable
vector. It is unclear what, if any, of this function works with scalable
vectors.
Identified by test LLVM.Transforms/InstCombine::nsw.ll
Reviewers: efriedma, david-arm, fpetrogalli, spatel
Reviewed By: efriedma
Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79196
In `InstCombiner::visitAdd()`, we have
```
// A+B --> A|B iff A and B have no bits set in common.
if (haveNoCommonBitsSet(LHS, RHS, DL, &AC, &I, &DT))
return BinaryOperator::CreateOr(LHS, RHS);
```
so we should handle such `or`'s here, too.
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().
I also made a few cleanups in here. For example, to removes use
of getElementType on a pointer when we could just use getFunctionType
from the call.
Differential Revision: https://reviews.llvm.org/D78882
(X | MaskC) == C --> (X & ~MaskC) == C ^ MaskC
(X | MaskC) != C --> (X & ~MaskC) != C ^ MaskC
We have more analyis for 'and' patterns and already lean this way
in the existing code, so this should be neutral or better in IR.
If this does not do as well in codegen, the problem already exists
and we should fix that based on target costs/heuristics.
http://volta.cs.utah.edu:8080/z/oP3ecL
define void @src(i8 %x, i8 %OrC, i8 %C, i1* %p0, i1* %p1) {
%or = or i8 %x, %OrC
%eq = icmp eq i8 %or, %C
store i1 %eq, i1* %p0
%ne = icmp ne i8 %or, %C
store i1 %ne, i1* %p1
ret void
}
define void @tgt(i8 %x, i8 %OrC, i8 %C, i1* %p0, i1* %p1) {
%NotOrC = xor i8 %OrC, -1
%a = and i8 %x, %NotOrC
%NewC = xor i8 %C, %OrC
%eq = icmp eq i8 %a, %NewC
store i1 %eq, i1* %p0
%ne = icmp ne i8 %a, %NewC
store i1 %ne, i1* %p1
ret void
}
As discussed in PR45478:
https://bugs.llvm.org/show_bug.cgi?id=45478
...propagating FMF from the outer (second) call is not correct,
so intersect them instead.
I suspect we could do better (see TODO comment), but mismatched
FMF is probably too rare to care about.
Differential Revision: https://reviews.llvm.org/D78631
While we can do that, it doesn't increase instruction count,
if the old `sub` sticks around then the transform is not only
not a unlikely win, but a likely regression, since we likely
now extended live range and use count of both of the `sub` operands,
as opposed to just the result of `sub`.
As Kostya Serebryany notes in post-commit review in
https://reviews.llvm.org/D68408#1998112
this indeed can degrade final assembly,
increase register pressure, and spilling.
This isn't what we want here,
so at least for now let's guard it with an use check.
Summary:
As we have discussed previously (e.g. in D63992 / D64090 / [[ https://bugs.llvm.org/show_bug.cgi?id=42457 | PR42457 ]]), `sub` instruction
can almost be considered non-canonical. While we do convert `sub %x, C` -> `add %x, -C`,
we sparsely do that for non-constants. But we should.
Here, i propose to interpret `sub %x, %y` as `add (sub 0, %y), %x` IFF the negation can be sinked into the `%y`
This has some potential to cause endless combine loops (either around PHI's, or if there are some opposite transforms).
For former there's `-instcombine-negator-max-depth` option to mitigate it, should this expose any such issues
For latter, if there are still any such opposing folds, we'd need to remove the colliding fold.
In any case, reproducers welcomed!
Reviewers: spatel, nikic, efriedma, xbolva00
Reviewed By: spatel
Subscribers: xbolva00, mgorny, hiraditya, reames, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68408
The handling of the `returned` attribute in D75815 did miss the case
where the argument is (bit)casted to a different type. This is
explicitly allowed by the language reference and exposed by the
Attributor.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D77977
Summary:
This patch fix the following issues in InstCombiner::visitGetElementPtrInst
1. Skip for scalable type if transformation requires fixed size number of
vector element.
2. Skip for scalable type if transformation relies on compile-time known type
alloc size.
3. Use VectorType::getElementCount when scalable property is used to construct
new VectorType.
4. Use TypeSize::getKnownMinSize when minimal size of a scalable type is valid to determine GEP 'inbounds'.
5. Explicitly call TypeSize::getFixedSize to avoid implicit type conversion to uint64_t.
Reviewers: sdesmalen, efriedma, spatel, ctetreau
Reviewed By: efriedma
Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78081
As proposed in D77881, we'll have the related widening operation,
so this name becomes too vague.
While here, change the function signature to take an 'int' rather
than 'size_t' for the scaling factor, add an assert for overflow of
32-bits, and improve the documentation comments.
Revision a1c05fe <https://reviews.llvm.org/rGa1c05fe20f3def1f1be9f50d2adefc6b6f1578ad>
removed bitcast from the list of problematic transformations, however:
%97 = fptrunc ppc_fp128 %2 to double // we need to check ppc_fp128 here to prevent the transformation
%98 = bitcast double %97 to i64 // a1c05fe checks ppc_fp128 at here
%99 = icmp slt i64 %98, 0
%100 = zext i1 %99 to i8
store i8 %100, i8* %7, align 1
so this patch does that. I'm also disabling it in the presence of extend just in case.
I verified separately that the hash of -std::infinity and std::infinity don't match now.
Differential Revision: https://reviews.llvm.org/D77911
Summary:
There are at least three clients for KnownBits calculations:
ValueTracking, SelectionDAG and GlobalISel. To reduce duplication the
common logic should be moved out of these clients and into KnownBits
itself.
This patch does this for AND, OR and XOR calculations by implementing
and using appropriate operator overloads KnownBits::operator& etc.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74060
Now compiler defines 5 sets of constants to represent rounding mode.
These are:
1. `llvm::APFloatBase::roundingMode`. It specifies all 5 rounding modes
defined by IEEE-754 and is used in `APFloat` implementation.
2. `clang::LangOptions::FPRoundingModeKind`. It specifies 4 of 5 IEEE-754
rounding modes and a special value for dynamic rounding mode. It is used
in clang frontend.
3. `llvm::fp::RoundingMode`. Defines the same values as
`clang::LangOptions::FPRoundingModeKind` but in different order. It is
used to specify rounding mode in in IR and functions that operate IR.
4. Rounding mode representation used by `FLT_ROUNDS` (C11, 5.2.4.2.2p7).
Besides constants for rounding mode it also uses a special value to
indicate error. It is convenient to use in intrinsic functions, as it
represents platform-independent representation for rounding mode. In this
role it is used in some pending patches.
5. Values like `FE_DOWNWARD` and other, which specify rounding mode in
library calls `fesetround` and `fegetround`. Often they represent bits
of some control register, so they are target-dependent. The same names
(not values) and a special name `FE_DYNAMIC` are used in
`#pragma STDC FENV_ROUND`.
The first 4 sets of constants are target independent and could have the
same numerical representation. It would simplify conversion between the
representations. Also now `clang::LangOptions::FPRoundingModeKind` and
`llvm::fp::RoundingMode` do not contain the value for IEEE-754 rounding
direction `roundTiesToAway`, although it is supported natively on
some targets.
This change defines all the rounding mode type via one `llvm::RoundingMode`,
which also contains rounding mode for IEEE rounding direction `roundTiesToAway`.
Differential Revision: https://reviews.llvm.org/D77379
Passing a Value * to CreateCall has to call getPointerElementType
to find the type of the pointer.
In this case we can rely on the fact that Intrinsic::getDeclaration
returns a Function * and use that version of CreateCall.
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: sdesmalen, rriddle, efriedma
Reviewed By: sdesmalen
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77263
Based on the post-commit comments for rG0f56bbc, there might
be a problem with this transform:
(bitcast (fpext/fptrunc X)) to iX) < 0 --> (bitcast X to iY) < 0
...and the ppc_fp128 data type, so conservatively bypass if we
are bitcasting a ppc_fp128.
We might be able to account for endian or other differences to
enable this for PowerPC again if that is useful.
Differential Revision: https://reviews.llvm.org/D77642
As discussed in D76983, that patch can turn a chain of insert/extract
with scalar trunc ops into bitcast+extract and existing instcombine
vector transforms end up creating a shuffle out of that (see the
PhaseOrdering test for an example). Currently, that process requires
at least this sequence: -instcombine -early-cse -instcombine.
Before D76983, the sequence of insert/extract would reach the SLP
vectorizer and become a vector trunc there.
Based on a small sampling of public targets/types, converting the
shuffle to a trunc is better for codegen in most cases (and a
regression of that form is the reason this was noticed). The trunc is
clearly better for IR-level analysis as well.
This means that we can induce "spontaneous vectorization" without
invoking any explicit vectorizer passes (at least a vector cast op
may be created out of scalar casts), but that seems to be the right
choice given that we started with a chain of insert/extract, and the
backend would expand back to that chain if a target does not support
the op.
Differential Revision: https://reviews.llvm.org/D77299
eraseInstFromFunction() adds the operands of the erased instructions,
as those might now be dead as well. However, this is limited to
instructions with less than 8 operands.
This check doesn't make a lot of sense to me. As the instruction
gets removed afterwards, I don't see a potential for anything
overly pathological happening here (as we can only add those
operands to the worklist once). The impact on CTMark is in
the noise. We also have the same code in instruction sinking
and don't limit the operand count there.
Differential Revision: https://reviews.llvm.org/D77325
These are versions of a function that regressed with:
rGf2fbdf76d8d0
That particular problem occurs with an instcombine-simplifycfg-instcombine
sequence, but we can show that it exists within instcombine only with
other variations of the pattern.
This reverts commit f2fbdf76d8.
As noted in the post-commit thread:
https://reviews.llvm.org/rGf2fbdf76d8d0
...this can obscure a min/max pattern where the components
have extra uses. We can show that the problem is independent
of this change with a slightly modified source example, so
this revert just delays/reduces the need to fix the real
problem.
We need to improve our analysis of negation or -- more
generally -- subtraction using patches like D77230 or D68408.
Summary:
Splitting Knowledge retention into Queries in Analysis and Builder into Transform/Utils
allows Queries and Transform/Utils to use Analysis.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: mgorny, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77171
Negation is equivalent to bitwise-not + 1, so try to convert more
subtracts into adds using this relationship:
0 - (A ^ C) => ((A ^ C) ^ -1) + 1 => A ^ ~C + 1
I doubt this will recover the regression noted in rGf2fbdf76d8d0,
but seems like we're going to need to improve here and/or revive D68408?
Alive2 proofs:
http://volta.cs.utah.edu:8080/z/Re5tMUhttp://volta.cs.utah.edu:8080/z/An-uns
Differential Revision: https://reviews.llvm.org/D77230