Commit Graph

5 Commits

Author SHA1 Message Date
Roman Lebedev 017e272c3a [Codegen] (X & (C l>>/<< Y)) ==/!= 0 --> ((X <</l>> Y) & C) ==/!= 0 fold
Summary:
This was originally reported in D62818.
https://rise4fun.com/Alive/oPH

InstCombine does the opposite fold, in hope that `C l>>/<< Y` expression
will be hoisted out of a loop if `Y` is invariant and `X` is not.
But as it is seen from the diffs here, if it didn't get hoisted,
the produced assembly is almost universally worse.

Much like with my recent "hoist add/sub by/from const" patches,
we should get almost universal win if we hoist constant,
there is almost always an "and/test by imm" instruction,
but "shift of imm" not so much, so we may avoid having to
materialize the immediate, and thus need one less register.
And since we now shift not by constant, but by something else,
the live-range of that something else may reduce.

Special care needs to be applied not to disturb x86 `BT` / hexagon `tstbit`
instruction pattern. And to not get into endless combine loop.

Reviewers: RKSimon, efriedma, t.p.northover, craig.topper, spatel, arsenm

Reviewed By: spatel

Subscribers: hiraditya, MaskRay, wuzish, xbolva00, nikic, nemanjai, jvesely, wdng, nhaehnle, javed.absar, tpr, kristof.beyls, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62871

llvm-svn: 366955
2019-07-24 22:57:22 +00:00
David Green 2b20ee4110 [ARM] Favour PL/MI over GE/LT when possible
The arm condition codes for GE is N==V (and for LT is N!=V). If the source of
flags cannot set V (overflow), such as a cmp against #0, then we can use the
simpler PL and MI conditions that only check N. As these PL/MI conditions are
simpler than GE/LT, other passes like the peephole optimiser can have a better
time optimising away the redundant CMPs.

The exception is the VSEL instruction, which cannot take the PL code, so there
the transform favours GE.

Differential Revision: https://reviews.llvm.org/D64160

llvm-svn: 365117
2019-07-04 08:58:58 +00:00
Huihui Zhang d16779a732 [ARM] Comply with rules on ARMv8-A thumb mode partial deprecation of IT.
Summary:
When identifing instructions that can be folded into a MOVCC instruction,
checking for a predicate operand is not enough, also need to check for
thumb2 function, with restrict-IT, is the machine instruction eligible for
ARMv8 IT or not.

Notes in ARMv8-A Architecture Reference Manual, section "Partial deprecation of IT"
  https://usermanual.wiki/Pdf/ARM20Architecture20Reference20ManualARMv8.1667877052.pdf

"ARMv8-A deprecates some uses of the T32 IT instruction. All uses of IT that apply to
instructions other than a single subsequent 16-bit instruction from a restricted set
are deprecated, as are explicit references to the PC within that single 16-bit
instruction. This permits the non-deprecated forms of IT and subsequent instructions
to be treated as a single 32-bit conditional instruction."

Reviewers: efriedma, lebedev.ri, t.p.northover, jmolloy, aemerson, compnerd, stoklund, ostannard

Reviewed By: ostannard

Subscribers: ostannard, javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63474

llvm-svn: 363739
2019-06-18 20:55:09 +00:00
Roman Lebedev 2e49e8196d [NFC][Codegen] D62818 - also add tests with X being constant
For X86, these may be a 'BT' pattern, and in general, can cause
the transform to deadlock.

llvm-svn: 362494
2019-06-04 11:44:50 +00:00
Roman Lebedev 6dc8ce323e [NFC][Codegen] Add tests for hoisting and-by-const from "logical shift", when then eq-comparing with 0
This was initially reported as: https://reviews.llvm.org/D62818

https://rise4fun.com/Alive/oPH

llvm-svn: 362455
2019-06-03 22:30:18 +00:00