Add (opt-in) support for implicit truncation to isConstOrConstSplat, which allows us to match truncated 'all ones' cases in isBitwiseNot.
PR41020 compares against using ISD::isBuildVectorAllOnes() instead, but that predicate silently accepts any UNDEF elements in the build vector which might not be what we want in isBitwiseNot - so I've added an opt-in 'AllowUndefs' flag that is set to false by default but will allow us to enable it on individual cases where its safe.
Differential Revision: https://reviews.llvm.org/D62783
llvm-svn: 362323
Fold (x & ~y) | y and it's four commuted variants to x | y. This pattern
can in particular appear when a vselect c, x, -1 is expanded to
(x & ~c) | (-1 & c) and combined to (x & ~c) | c.
This change has some overlap with D59066, which avoids creating a
vselect of this form in the first place during uaddsat expansion.
Differential Revision: https://reviews.llvm.org/D59174
llvm-svn: 356333
Switch BIC immediate creation for vector ANDs from custom lowering
to a DAG combine, which gives generic DAG combines a change to
apply first. In particular this avoids (and x, -1) being turned into
a (bic x, 0) instead of being eliminated entirely.
Differential Revision: https://reviews.llvm.org/D59187
llvm-svn: 356299
Summary:
AArch64 can fold some shift+extend operations on the RHS operand of
comparisons, so swap the operands if that makes sense.
This provides a fix for https://bugs.llvm.org/show_bug.cgi?id=38751
Reviewers: efriedma, t.p.northover, javed.absar
Subscribers: mcrosier, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D53067
llvm-svn: 344439
This is a preliminary step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613
If we have an 'add' instruction that sets flags, we can use that to eliminate an
explicit compare instruction or some other instruction (cmn) that sets flags for
use in the later select.
As shown in the unchanged tests that use 'icmp ugt %x, %a', we're effectively
reversing an IR icmp canonicalization that replaces a variable operand with a
constant:
https://rise4fun.com/Alive/V1Q
But we're not using 'uaddo' in those cases via DAG transforms. This happens in
CGP after D8889 without checking target lowering to see if the op is supported.
So AArch already shows 'uaddo' codegen for the i8/i16/i32/i64 test variants with
"using_cmp_sum" in the title. That's the pattern that CGP matches as an unsigned
saturated add and converts to uaddo without checking target capabilities.
This patch is gated by isOperationLegalOrCustom(ISD::UADDO, VT), so we see only
see AArch diffs for i32/i64 in the tests with "using_cmp_notval" in the title
(unlike x86 which sees improvements for all sizes because all sizes are 'custom').
But the AArch code (like x86) looks better when translated to 'uaddo' in all cases.
So someone that is involved with AArch may want to set i8/i16 to 'custom' for UADDO,
so this patch will fire on those tests.
Another possibility given the existing behavior: we could remove the legal-or-custom
check altogether because we're assuming that a UADDO sequence is canonical/optimal
before we ever reach here. But that seems like a bug to me. If the target doesn't
have an add-with-flags op, then it's not likely that we'll get optimal DAG combining
using a UADDO node. This is similar justification for why we don't canonicalize IR to
the overflow math intrinsic sibling (llvm.uadd.with.overflow) for UADDO in the first
place.
Differential Revision: https://reviews.llvm.org/D51929
llvm-svn: 342886