Commit Graph

10258 Commits

Author SHA1 Message Date
Matt Arsenault 3922392969 AMDGPU: Correct behavior of f16 buffer loads
Don't assume format loads for f16. Also fixes support for targets
without i16.

llvm-svn: 367879
2019-08-05 15:59:07 +00:00
Guillaume Chatelet c97a3d15d2 [LLVM][Alignment] Introduce Alignment Type
Summary:
This is patch is part of a serie to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, jfb, jakehehrlich

Reviewed By: jfb

Subscribers: wuzish, jholewinski, arsenm, dschuff, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, s.egerton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65514

llvm-svn: 367828
2019-08-05 11:02:05 +00:00
Oliver Stannard 8ed8353fc4 Reland: Fix and test inter-procedural register allocation for ARM
Add an explicit construction of the ArrayRef, gcc 5 and earlier don't
seem to select the ArrayRef constructor which takes a C array when the
construction is implicit.

Original commit message:

- Avoid a crash when IPRA calls ARMFrameLowering::determineCalleeSaves
  with a null RegScavenger. Simply not updating the register scavenger
  is fine because IPRA only cares about the SavedRegs vector, the acutal
  code of the function has already been generated at this point.
- Add a new hook to TargetRegisterInfo to get the set of registers which
  can be clobbered inside a call, even if the compiler can see both
  sides, by linker-generated code.

Differential revision: https://reviews.llvm.org/D64908

llvm-svn: 367819
2019-08-05 09:04:10 +00:00
David Green 91296295d0 [ARM] MVE big endian bitcasts
This adds big endian MVE patterns for bitcasts. They are defined in llvm as
being the same as a store of the existing type and the load into the new. This
means that they have to become a VREV between the two types, working in the
same way that NEON works in big-endian. This also adds some example tests for
bigendian, showing where code is and isn't different.

The main difference, especially from a testing perspective is that vectors are
passed as v2f64, and so are VREV into and out of call arguments, and the
parameters are passed in a v2f64 format. Same happens for inline assembly where
the register class is used, so it is VREV to a v16i8.

So some of this is probably not correct yet, but it is (mostly) self-consistent
and seems to be consistent with how llvm treats vectors. The rest we can
hopefully fix later. More details about big endian neon can be found in
https://llvm.org/docs/BigEndianNEON.html.

Differential Revision: https://reviews.llvm.org/D65581

llvm-svn: 367780
2019-08-04 10:18:15 +00:00
Nikita Popov 4f8259bdbc [Thumb] Fix invalid symbol redefinition due to duplicated jumptable (PR42760)
Fix for https://bugs.llvm.org/show_bug.cgi?id=42760. A tBR_JTr
instruction is duplicated by tail duplication, which results in
the same jumptable with the same label being emitted twice.

Fix this by marking tBR_JTr as not duplicable. The corresponding
ARM/Thumb instructions are already marked as not duplicable.
Additionally also mark tTBB_JT and tTBH_JT to be consistent with
Thumb2, even though this shouldn't be strictly necessary.

Differential Revision: https://reviews.llvm.org/D65606

llvm-svn: 367753
2019-08-03 06:47:23 +00:00
Bill Wendling 41a2847a9a Emit diagnostic if an inline asm constraint requires an immediate
Summary:
An inline asm call can result in an immediate after inlining. Therefore emit a
diagnostic here if constraint requires an immediate but one isn't supplied.

Reviewers: joerg, mgorny, efriedma, rsmith

Reviewed By: joerg

Subscribers: asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, s.egerton, MaskRay, jyknight, dylanmckay, javed.absar, fedor.sergeev, jrtc27, Jim, krytarowski, eraman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60942

llvm-svn: 367750
2019-08-03 05:52:47 +00:00
Douglas Yung 42618b270d Revert Fix and test inter-procedural register allocation for ARM
This reverts r367669 (git commit f6b00c279a)

This was breaking a build bot http://lab.llvm.org:8011/builders/netbsd-amd64/builds/21233

llvm-svn: 367731
2019-08-02 22:11:49 +00:00
Tim Northover 522fb7eedc GlobalISel: support swiftself attribute
llvm-svn: 367683
2019-08-02 14:09:49 +00:00
Oliver Stannard 4b7239ebac [IPRA][ARM] Disable no-CSR optimisation for ARM
This optimisation isn't generally profitable for ARM, because we can
save/restore many registers in the prologue and epilogue using the PUSH
and POP instructions, but mostly use individual LDR/STR instructions for
other spills.

Differential revision: https://reviews.llvm.org/D64910

llvm-svn: 367670
2019-08-02 10:23:17 +00:00
Oliver Stannard f6b00c279a Fix and test inter-procedural register allocation for ARM
- Avoid a crash when IPRA calls ARMFrameLowering::determineCalleeSaves
  with a null RegScavenger. Simply not updating the register scavenger
  is fine because IPRA only cares about the SavedRegs vector, the acutal
  code of the function has already been generated at this point.
- Add a new hook to TargetRegisterInfo to get the set of registers which
  can be clobbered inside a call, even if the compiler can see both
  sides, by linker-generated code.

Differential revision: https://reviews.llvm.org/D64908

llvm-svn: 367669
2019-08-02 10:23:05 +00:00
Sam Parker cd38599275 [NFC][ARM[ParallelDSP] Rename/remove/change types
Remove forward declaration, fold a couple of typedefs and change one
to be more useful.

llvm-svn: 367665
2019-08-02 08:21:17 +00:00
Sam Parker 14c6dfdfe2 [NFC][ARM][ParallelDSP] Remove ValueList
We only care about the first element in the list.

llvm-svn: 367660
2019-08-02 07:32:28 +00:00
Daniel Sanders 2bea69bf65 Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC
llvm-svn: 367633
2019-08-01 23:27:28 +00:00
David Green 1343814fb4 [ARM] Fix for MVE VREV64
The VREV64 instruction is apparently unpredictable if Qd == Qm, due to the
cross-beat nature of the instruction. This adds an earlyclobber to Qd, which
seems to be the same way we deal with this on other instructions like the
write-back on loads and stores.

Differential Revision: https://reviews.llvm.org/D65502

llvm-svn: 367544
2019-08-01 11:22:03 +00:00
Sam Parker 7ca8c6f6db [NFC][ARM][ParallelDSP] Getters and renaming
Add a couple of getters for Reduction and do some renaming of
variables around CreateSMLAD for clarity.

llvm-svn: 367522
2019-08-01 08:17:51 +00:00
Eli Friedman 89b80f1239 [ARM] Lower "(x<<c) > 0x80000000U" to "lsls" on Thumb1.
This is extremely specific, but saves three instructions when it's
legal.  I don't think the code can be usefully generalized.

Differential Revision: https://reviews.llvm.org/D65351

llvm-svn: 367492
2019-07-31 23:19:21 +00:00
Eli Friedman 2f45ec1c39 [ARM] Transform compare of masked value to shift on Thumb1.
Thumb1 has very limited immediate modes, so turning an "and" into a
shift can save multiple instructions.

It's possible to simplify the generated code for test2 and test3 in
cmp-and-fold.ll a little more, but I'll implement that as a followup.

Differential Revision: https://reviews.llvm.org/D65175

llvm-svn: 367491
2019-07-31 23:17:34 +00:00
Mark Lacey 641ea2e701 [GISel] Address review feedback on passing MD_callees to lowerCall.
Preserve the nullptr default for KnownCallees that appears in
the base class.

llvm-svn: 367477
2019-07-31 20:34:05 +00:00
Mark Lacey 7b8d3eb9e2 [GISel] Pass MD_callees metadata down in call lowering.
Summary:
This will make it possible to improve IPRA by taking into account
register usage in indirect calls.

NFC yet; this is just laying the groundwork to start building
up patches to take advantage of the information for improved register
allocation.

Reviewers: aditya_nandakumar, volkan, qcolombet, arsenm, rovka, aemerson, paquette

Subscribers: sdardis, wdng, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65488

llvm-svn: 367476
2019-07-31 20:34:02 +00:00
Mikhail Maltsev 806231ecc3 [ARM] Reject CSEL instructions with invalid operands
Summary:
According to the Armv8.1-M manual CSEL, CSINC, CSINV and CSNEG are
"constrained unpredictable" when SP is used as the source register Rn.

The assembler should diagnose this case.

Reviewers: momchil.velikov, dmgreen, ostannard, simon_tatham, t.p.northover

Reviewed By: ostannard

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65505

llvm-svn: 367433
2019-07-31 14:22:45 +00:00
Oliver Cruickshank 09a1b8172b [ARM] Generate MVE VFMAs
llvm-svn: 367408
2019-07-31 10:44:11 +00:00
Oliver Cruickshank e7241e8592 [NFC] Test Commit
llvm-svn: 367405
2019-07-31 10:08:09 +00:00
Sam Parker 5ea07f7c07 [NFC][ARMCGP] Use switch in isSupportedValue
Use a switch instead of many isa<> while checking for supported
values. Also be explicit about which cast instructions are supported;
This allows the removal of SIToFP from GenerateSignBits.

llvm-svn: 367402
2019-07-31 09:34:11 +00:00
Sam Parker 2200a9bdf3 [ARM][ParallelDSP] Convert to function pass
Run across a whole function, visiting each basic block one at a time.

Differential Revision: https://reviews.llvm.org/D65324

llvm-svn: 367389
2019-07-31 07:32:03 +00:00
Sam Parker e3a4a13fcc [ARM][LowOverheadLoops] Enable by default
The code is now in a good enough state to pass the bunch of tests that
I have run (after fixing the bugs), so let's enable it by default.

Differential Revision: https://reviews.llvm.org/D65277

llvm-svn: 367297
2019-07-30 08:14:28 +00:00
Sam Parker ed2ea3e46b [ARM][LowOverheadLoops] Revert non-header LE target
Revert the hardware loop upon finding a LoopEnd that doesn't target
the loop header, instead of asserting a failure.

Differential Revision: https://reviews.llvm.org/D65268

llvm-svn: 367296
2019-07-30 08:08:44 +00:00
Sam Parker 414dd1c946 [NFC][ARM[ParallelDSP] Cleanup of BinOpChain
- Remove some unused typedefs.
- Rename BinOpChain struct to MulCandidate.
- Remove the size method of MulCandidate.
- Store only the first input of the ValueList provided to
  MulCandidate, as it's the only value we care about. This means we
  don't have to perform any ugly (and unnecessary) iterations of the 
  list later on.

llvm-svn: 367208
2019-07-29 08:41:51 +00:00
Sam Parker 8538060103 [NFC][ARM][ParallelDSP] Remove AreSymmetrical
We explicitly search for a parallel mac and we only care about its
inputs, checking for symmetry doesn't add anything here.

llvm-svn: 367205
2019-07-29 08:12:24 +00:00
Sam Parker 11ad33ede6 [NFC][ARM][ParallelDSP] Remove PopulateLoads
We no longer have to check what loads are used, all this
is performed at the start of the transform, so it's not
doing anything now.

llvm-svn: 367204
2019-07-29 08:07:23 +00:00
David Green b8b8b46a51 [ARM] MVE VPNOT
This adds the patterns required to transform xor P0, -1 to a VPNOT. The
instruction operands have to change a little for this, adding an in and an out
VCCR reg and using a custom DecodeMVEVPNOT for the decode.

Differential Revision: https://reviews.llvm.org/D65133

llvm-svn: 367192
2019-07-28 14:07:48 +00:00
David Green 9cf344e739 [ARM] Better patterns for fp <> predicate vectors
These are some better patterns for converting between predicates and floating
points. Much like the extends, we select "1"/"-1" or "0" depending on the
predicate value. Or we perform a compare against 0 to convert to a predicate.

Differential Revision: https://reviews.llvm.org/D65103

llvm-svn: 367191
2019-07-28 13:53:39 +00:00
Sam Parker 3da59e5513 [ARM][ParallelDSP] Combine structs
Combine OpChain and BinOpChain structs as OpChain is a base class to
BinOpChain that is never used.

llvm-svn: 367114
2019-07-26 14:11:40 +00:00
Sam Parker 7440065bd8 [NFC][ARM][ParallelDSP] Cleanup isNarrowSequence
Remove unused logic.

llvm-svn: 367099
2019-07-26 10:57:42 +00:00
Sam Parker c760b5da11 [ARM][LowOverheadLoops] Add CPSR defs
Both WhileLoopStart and LoopEnd may get turned into a cmp and br pair,
so add an implicit def to these pseudo instructions in case that WLS
and LE aren't generated.

Differential Revision: https://reviews.llvm.org/D65275

llvm-svn: 367089
2019-07-26 08:15:01 +00:00
Pablo Barrio 275954539d [ARM][AArch64] Support for Cortex-A65 & A65AE, Neoverse E1 & N1
Summary:
Add support for Cortex-A65, Cortex-A65AE, Neoverse E1 and Neoverse N1.
Neoverse E1 and Cortex-A65(&AE) only implement the AArch64 state of the
Arm architecture. Neoverse N1 implements both AArch32 and AArch64.

Cortex-A65:
https://developer.arm.com/ip-products/processors/cortex-a/cortex-a65

Cortex-A65AE:
https://developer.arm.com/ip-products/processors/cortex-a/cortex-a65ae

Neoverse E1:
https://developer.arm.com/ip-products/processors/neoverse/neoverse-e1

Neoverse N1:
https://developer.arm.com/ip-products/processors/neoverse/neoverse-n1

Patch by Diogo Sampaio and Pablo Barrio

Reviewers: samparker, LukeCheeseman, sbaranga, ostannard

Reviewed By: ostannard

Subscribers: ostannard, javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64406

llvm-svn: 367007
2019-07-25 10:59:45 +00:00
Eli Friedman 82e109279d [ARM] Remove dead code from ARMConstantIslands.
tLDRHi is not a pc-relative load; it can't directly refer to a
constant pool or jump table.

llvm-svn: 366963
2019-07-24 23:36:14 +00:00
David Green cd7a6fa314 [ARM] Rewrite how VCMP are lowered, using a single node
This removes the VCEQ/VCNE/VCGE/VCEQZ/etc nodes, just using two called VCMP and
VCMPZ with an extra operand as the condition code. I believe this will make
some combines simpler, allowing us to just look at these codes and not the
operands. It also helps fill in a missing VCGTUZ MVE selection without adding
extra nodes for it.

Differential Revision: https://reviews.llvm.org/D65072

llvm-svn: 366934
2019-07-24 17:36:47 +00:00
David Green 047a0b6575 [ARM] Disable MVE fptosi and friends
The prevents us from trying to convert an i1 predicate vector to a float, or
vice-versa. Better patterns are possible, which will follow in a subsequent
commit. For now we just expand them.

Differential Revision: https://reviews.llvm.org/D65066

llvm-svn: 366931
2019-07-24 17:26:26 +00:00
David Green b342bddbe2 [ARM] More MVE compare vector splat combines for ANDs
Adds some extra r register compare combines, this time for ANDs.

Differential Revision: https://reviews.llvm.org/D65062

llvm-svn: 366928
2019-07-24 17:08:09 +00:00
David Green 93b5f61295 [ARM] MVE compare vector splat combine
MVE VCMP instructions can use a general purpose register as the second operand.
This adds the combines for it, selecting from a compare of a vdup.

Differential Revision: https://reviews.llvm.org/D65061

llvm-svn: 366924
2019-07-24 16:58:41 +00:00
David Green bab4d8ac5a [ARM] Better OR's for MVE compares
This adds a DeMorgan combine for OR's of compares to turn them into AND's,
helping prevent them from going into and out of gpr registers. It also fills in
the VCLE and VCLT nodes that MVE can select, allowing it to invert more
compares.

Differential Revision: https://reviews.llvm.org/D65059

llvm-svn: 366920
2019-07-24 16:42:09 +00:00
David Green 69fba7434e [ARM] Better AND's for MVE compares
Add a number of folds to convert and(vcmp, vcmp) into a single VPT block, where
the second vcmp becomes predicated on the first.

The VCMP; VPST; VCMP will eventually be converted to VPT; VCMP in the
VPTBlockPass.

Differential Revision: https://reviews.llvm.org/D65058

llvm-svn: 366910
2019-07-24 14:42:05 +00:00
David Green 4fc78c496e [ARM] MVE floating point compares and selects
Much like integers, this adds MVE floating point compares and select. It
requires a lot more buildvector/shuffle code because we may need to expand the
compares without mve.fp, and requires support for and/or because of the way we
lower llvm condition codes.

Some original code by David Sherwood

Differential Revision: https://reviews.llvm.org/D65054

llvm-svn: 366909
2019-07-24 14:28:22 +00:00
David Green a4a4698c16 [ARM] Basic And/Or/Xor handling for MVE predicates
This adds some basic, "worst case" handling for MVE predicate Or/And/Xor. It
does this by going into and out of GPRs, doing the operation on scalars.

Code by David Sherwood.

Differential Revision: https://reviews.llvm.org/D65053

llvm-svn: 366907
2019-07-24 14:17:54 +00:00
Simi Pallipurath 724888af45 [ARM] Make sure that the constant pool does not keep in the middle of an IT block.
This change make sure that llvm does not emit an invalid IT block
by putting the constant pool in the middle of an IT block.

We have code to try to avoid putting a constant island in the middle of an
IT block, but it only works if we see an IT between the one currently
referencing CPE and possible insertion point. If the first instruction
we look at is the VLDRD after the IT , we never see the IT and does not
realize that the instruction doing the load could be in an IT block itself.

Differential Revision: https://reviews.llvm.org/D64621

Change-Id: I24cecb37cded75e8992870bd997f6226853bd920
llvm-svn: 366905
2019-07-24 13:54:14 +00:00
Sjoerd Meijer a19f5a76e6 Test commit. NFC.
Removed 2 trailing whitespaces in 2 files that used to be in different
repos to test my new github monorepo workflow.

llvm-svn: 366904
2019-07-24 13:30:36 +00:00
David Green c7e55d4f52 [ARM] MVE predicate register support
This adds support code for building and shuffling i1 predicate registers. It
generally uses two basic principles, either converting the predicate into an
scalar (through a PREDICATE_CAST) and doing scalar operations on it there, or
by converting the register to an full vector register and back.

Some of the code here is a not super efficient but will hopefully cover most
cases of moving i1 vectors around and can be improved in subsequent patches.

Some code by David Sherwood.

Differential Revision: https://reviews.llvm.org/D65052

llvm-svn: 366890
2019-07-24 11:51:36 +00:00
David Green b9d96ceca0 [ARM] MVE integer compares and selects
This adds the very basics for MVE vector predication, adding integer VCMP and
VSEL instruction support. This is done through predicate registers (MVT::v16i1,
MVT::v8i1, MVT::v4i1), but otherwise using same mechanics as NEON to custom
lower setcc's through ARMISD::VCXX nodes (VCEQ, VCGT, VCEQZ, etc).

An extra VCNE was added, as this can be handled sensibly by MVE's expanded
number of VCMP condition codes. (There are also VCLE and VCLT which are added
later).

VPSEL is also added here, simply selecting on the vselect.

Original code by David Sherwood.

Differential Revision: https://reviews.llvm.org/D65051

llvm-svn: 366885
2019-07-24 11:08:14 +00:00
Sam Parker aeb21b96a0 [ARM][ParallelDSP] Fix pointer operand reordering
While combining two loads into a single load, we often need to
reorder the pointer operands for the new load. This reordering was
broken in the cases where there was a chain of values that built up
the pointer.

Differential Revision: https://reviews.llvm.org/D65193

llvm-svn: 366881
2019-07-24 09:38:39 +00:00
Eli Friedman b27fc95e89 [ARM] Add opt-bisect support to ARMParallelDSP.
llvm-svn: 366851
2019-07-23 20:48:46 +00:00
Sam Parker 57e87dd81b [ARM][LowOverheadLoops] Fix branch target codegen
While lowering test.set.loop.iterations, it wasn't checked how the
brcond was using the result and so the wls could branch to the loop
preheader instead of not entering it. The same was true for
loop.decrement.reg.
    
So brcond and br_cc and now lowered manually when using the hwloop
intrinsics. During this we now check whether the result has been
negated and whether we're using SETEQ or SETNE and 0 or 1. We can
then figure out which basic block the WLS and LE should be targeting.

Differential Revision: https://reviews.llvm.org/D64616

llvm-svn: 366809
2019-07-23 14:08:46 +00:00
David Green fdedf240f8 [ARM] Rename NEONModImm to VMOVModImm. NFC
Rename NEONModImm to VMOVModImm as it is used in both NEON and MVE.

llvm-svn: 366790
2019-07-23 09:19:24 +00:00
Sam Parker 4379a40088 [ARM][LowOverheadLoops] Revert remaining pseudos
ARMLowOverheadLoops would assert a failure if it did not find all the
pseudo instructions that comprise the hardware loop. Instead of doing
this, iterate through all the instructions of the function and revert
any remaining pseudo instructions that haven't been converted.

Differential Revision: https://reviews.llvm.org/D65080

llvm-svn: 366691
2019-07-22 14:16:40 +00:00
David Green 8876a312a8 [ARM] Fix for MVE VPT block pass
We need to ensure that the number of T's is correct when adding multiple
instructions into the same VPT block.

Differential revision: https://reviews.llvm.org/D65049

llvm-svn: 366684
2019-07-22 12:51:38 +00:00
Oliver Stannard 6771a89fa0 [IPRA][ARM] Make use of the "returned" parameter attribute
ARM has code to recognise uses of the "returned" function parameter
attribute which guarantee that the value passed to the function in r0
will be returned in r0 unmodified. IPRA replaces the regmask on call
instructions, so needs to be told about this to avoid reverting the
optimisation.

Differential revision: https://reviews.llvm.org/D64986

llvm-svn: 366669
2019-07-22 08:44:36 +00:00
Mikhail Maltsev 0b001f94a5 [ARM] Add <saturate> operand to SQRSHRL and UQRSHLL
Summary:
According to the new Armv8-M specification
https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf the
instructions SQRSHRL and UQRSHLL now have an additional immediate
operand <saturate>. The new assembly syntax is:

SQRSHRL<c> RdaLo, RdaHi, #<saturate>, Rm
UQRSHLL<c> RdaLo, RdaHi, #<saturate>, Rm

where <saturate> can be either 64 (the existing behavior) or 48, in
that case the result is saturated to 48 bits.

The new operand is encoded as follows:
  #64 Encoded as sat = 0
  #48 Encoded as sat = 1
sat is bit 7 of the instruction bit pattern.

This patch adds a new assembler operand class MveSaturateOperand which
implements parsing and encoding. Decoding is implemented in
DecodeMVEOverlappingLongShift.

Reviewers: ostannard, simon_tatham, t.p.northover, samparker, dmgreen, SjoerdMeijer

Reviewed By: simon_tatham

Subscribers: javed.absar, kristof.beyls, hiraditya, pbarrio, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64810

llvm-svn: 366555
2019-07-19 09:46:28 +00:00
Diogo N. Sampaio 11512e742b [ARM][DAGCOMBINE][FIX] PerformVMOVRRDCombine
Summary:
PerformVMOVRRDCombine ommits adding a offset
of 4 to the PointerInfo, when converting a
f64 = load[M]
to
{i32, i32} = {load[M], load[M + 4]}

Which would allow the machine scheduller
to break dependencies with the second load.

 - pr42638

Reviewers: eli.friedman, dmgreen, ostannard

Reviewed By: ostannard

Subscribers: ostannard, javed.absar, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64870

llvm-svn: 366423
2019-07-18 10:05:56 +00:00
Diana Picus 37e403d18c [ARM GlobalISel] Cleanup CallLowering. NFC
Migrate CallLowering::lowerReturnVal to use the same infrastructure as
lowerCall/FormalArguments and remove the now obsolete code path from
splitToValueTypes.

Forgot to push this earlier.

llvm-svn: 366308
2019-07-17 10:01:27 +00:00
Kyrylo Tkachov a3e26d1a6c [NFC] Test commit: add full stop at end of comment
llvm-svn: 366195
2019-07-16 09:15:01 +00:00
Rui Ueyama 49a3ad21d6 Fix parameter name comments using clang-tidy. NFC.
This patch applies clang-tidy's bugprone-argument-comment tool
to LLVM, clang and lld source trees. Here is how I created this
patch:

$ git clone https://github.com/llvm/llvm-project.git
$ cd llvm-project
$ mkdir build
$ cd build
$ cmake -GNinja -DCMAKE_BUILD_TYPE=Debug \
    -DLLVM_ENABLE_PROJECTS='clang;lld;clang-tools-extra' \
    -DCMAKE_EXPORT_COMPILE_COMMANDS=On -DLLVM_ENABLE_LLD=On \
    -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ ../llvm
$ ninja
$ parallel clang-tidy -checks='-*,bugprone-argument-comment' \
    -config='{CheckOptions: [{key: StrictMode, value: 1}]}' -fix \
    ::: ../llvm/lib/**/*.{cpp,h} ../clang/lib/**/*.{cpp,h} ../lld/**/*.{cpp,h}

llvm-svn: 366177
2019-07-16 04:46:31 +00:00
David Green dc56995c57 [ARM] MVE vector for 64bit types
We need to make sure that we are sensibly dealing with vectors of types v2i64
and v2f64, even if most of the time we cannot generate native operations for
them. This mostly adds a lot of testing, plus fixes up a couple of the issues
found. And, or and xor can be legal for v2i64, and shifts combining needs a
slight fixup.

Differential Revision: https://reviews.llvm.org/D64316

llvm-svn: 366106
2019-07-15 18:42:54 +00:00
David Green 8e7eee617a [ARM] Minor formatting in ARMInstrMVE.td. NFC
llvm-svn: 366089
2019-07-15 17:29:06 +00:00
David Green 6e89887642 [ARM] MVE Vector Shifts
This adds basic lowering for MVE shifts. There are many shifts in MVE, but the
instructions handled here are:
 VSHL (imm)
 VSHRu (imm)
 VSHRs (imm)
 VSHL (vector)
 VSHL (register)

MVE, like NEON before it, doesn't have shift right by a vector (or register).
We instead have to negate the amount and shift in the opposite direction. This
means we have to convert any SHR's into a form of SHL (that is still signed or
unsigned) with a negated condition and selecting from there. MVE still does
have shifting by an immediate for SHL, ASR and LSR.

This adds lowering for these and for register forms, which work well for shift
lefts but may require an extra fold of neg(vdup(x)) -> vdup(neg(x)) to potentially
work optimally for right shifts.

Differential Revision: https://reviews.llvm.org/D64212

llvm-svn: 366056
2019-07-15 11:35:39 +00:00
David Green f059147a10 [ARM] Move Shifts after Bits. NFC
This just moves the shift instruction definitions further down the
ARMInstrMVE.td file, to make positioning patterns slightly more natural.

llvm-svn: 366054
2019-07-15 11:22:05 +00:00
David Green da750b1688 [ARM] Adjust how NEON shifts are lowered
This adjusts the way that we lower NEON shifts to use a DAG target node, not
via a neon intrinsic. This is useful for handling MVE shifts operations in the
same the way. It also renames some of the immediate shift nodes for
consistency, and moves some of the processing of immediate shifts into
LowerShift allowing it to capture more cases.

Differential Revision: https://reviews.llvm.org/D64426

llvm-svn: 366051
2019-07-15 10:44:50 +00:00
David Green 458a720ec1 [ARM] Add sign and zero extend patterns for MVE
The vmovlb instructions can be uses to sign or zero extend vector registers
between types. This adds some patterns for them and relevant testing. The
VBICIMM generation is also put behind a hasNEON check (as is already done for
VORRIMM).

Code originally by David Sherwood.

Differential Revision: https://reviews.llvm.org/D64069

llvm-svn: 366008
2019-07-13 15:43:00 +00:00
David Green 07a7ec2021 [ARM] MVE VNEG instruction patterns
This selects integer VNEG instructions, which can be especially useful with shifts.

Differential Revision: https://reviews.llvm.org/D64204

llvm-svn: 366006
2019-07-13 15:26:51 +00:00
David Green 4ce648b5e8 [ARM] MVE integer abs
Similar to floating point abs, we also have instructions for integers.

Differential Revision: https://reviews.llvm.org/D64027

llvm-svn: 366005
2019-07-13 14:58:32 +00:00
David Green 701bf714db [ARM] MVE integer min and max
This simply makes the MVE integer min and max instructions legal and adds the
relevant patterns for them.

Differential Revision: https://reviews.llvm.org/D64026

llvm-svn: 366004
2019-07-13 14:48:54 +00:00
David Green ac5bcbeb9f [ARM] MVE VRINT support
This adds support for the floor/ceil/trunc/... series of instructions,
converting to various forms of VRINT. They use the same suffixes as their
floating point counterparts. There is not VTINTR, so nearbyint is expanded.

Also added a copysign test, to show it is expanded.

Differential Revision: https://reviews.llvm.org/D63985

llvm-svn: 366003
2019-07-13 14:38:53 +00:00
David Green ec8af0db6c [ARM] MVE minnm and maxnm instructions
This adds the patterns for minnm and maxnm from the fminnum and fmaxnum nodes,
similar to scalar types.

Original patch by Simon Tatham

Differential Revision: https://reviews.llvm.org/D63870

llvm-svn: 366002
2019-07-13 14:29:02 +00:00
Sam Parker 08b4a8da07 [ARM][LowOverheadLoops] Correct offset checking
This patch addresses a couple of problems:
1) The maximum supported offset of LE is -4094.
2) The offset of WLS also needs to be checked, this uses a
   maximum positive offset of 4094.
    
The use of BasicBlockUtils has been changed because the block offsets
weren't being initialised, but the isBBInRange checks both positive
and negative offsets.
    
ARMISelLowering has been tweaked because the test case presented
another pattern that we weren't supporting.

llvm-svn: 365749
2019-07-11 09:56:15 +00:00
Simon Tatham 7916198a41 [ARM] Remove nonexistent unsigned forms of MVE VQDMLAH.
The VQDMLAH.U8, VQDMLAH.U16 and VQDMLAH.U32 instructions don't
actually exist: the Armv8.1-M architecture spec only lists signed
forms of that instruction. The unsigned ones were added in error: they
existed in an early draft of the spec, but they were removed before
the public version, and we missed that particular spec change.

Also affects the variant forms VQDMLASH, VQRDMLAH and VQRDMLASH.

Reviewers: miyuki

Subscribers: javed.absar, kristof.beyls, hiraditya, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64502

llvm-svn: 365747
2019-07-11 09:52:15 +00:00
Sam Parker 85ad78b1cf [ARM][ParallelDSP] Change the search for smlads
Two functional changes have been made here:
- Now search up from any add instruction to find the chains of
  operations that we may turn into a smlad. This allows the
  generation of a smlad which doesn't accumulate into a phi.
- The search function has been corrected to stop it falsely searching
  up through an invalid path.
    
The bulk of the changes have been making the Reduction struct a class
and making it more C++y with getters and setters.

Differential Revision: https://reviews.llvm.org/D61780

llvm-svn: 365740
2019-07-11 07:47:50 +00:00
Sam Parker 775b2f598a [NFC][ARM] Convert lambdas to static helpers
Break up and convert some of the lambdas in ARMLowOverheadLoops into
static functions. 

llvm-svn: 365623
2019-07-10 12:29:43 +00:00
Mikhail Maltsev ed143c5d59 [ARM] Enable VPUSH/VPOP aliases when either MVE or VFP is present
Summary:
Use the same predicates as VSTMDB/VLDMIA since VPUSH/VPOP alias to
these.

Patch by Momchil Velikov.

Reviewers: ostannard, simon_tatham, SjoerdMeijer, samparker, t.p.northover, dmgreen

Reviewed By: dmgreen

Subscribers: javed.absar, kristof.beyls, hiraditya, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64413

llvm-svn: 365604
2019-07-10 08:59:17 +00:00
Mikhail Maltsev ee81051fc9 [ARM] Relax constraints on operands of VQxDMLxDH instructions
Summary:
According to a recently updated Armv8-M spec
(https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf) the
32-bit width versions of the following instructions:
* VQDMLADH
* VQDMLADHX
* VQRDMLADH
* VQRDMLADHX
* VQDMLSDH
* VQDMLSDHX
* VQRDMLSDH
* VQRDMLSDHX
are no longer unpredictable when their output register is the same as
one of the input registers.

This patch updates the assembler parser and the corresponding tests
and also removes @earlyclobber from the instruction constraints.

Reviewers: simon_tatham, ostannard, dmgreen, SjoerdMeijer, samparker

Reviewed By: simon_tatham

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64250

llvm-svn: 365306
2019-07-08 09:44:52 +00:00
Fangrui Song 7d63be09b6 [ARM] Fix null pointer dereference in CodeGen/ARM/Windows/stack-protector-msvc.ll.test after D64292/r365283
CLI.CS may not be set.

llvm-svn: 365299
2019-07-08 08:43:31 +00:00
Martin Storsjo 8d9d290d4c [ARM] Add support for MSVC stack cookie checking
Heavily based on the same for AArch64, from SVN r346469.

Differential Revision: https://reviews.llvm.org/D64292

llvm-svn: 365283
2019-07-07 18:57:31 +00:00
David Green 47afdaa487 [ARM] MVE patterns for VMVN, VORR and VBIC
This add simple Q register forms of bitwise not instructions.

Differential Revision: https://reviews.llvm.org/D63983

llvm-svn: 365214
2019-07-05 15:21:29 +00:00
David Green 25cf705097 [ARM] MVE VMOV immediate handling
This adds some handling for VMOVimm, using the same method that NEON uses. We
create VMOVIMM/VMVNIMM/VMOVFPIMM nodes based on the immediate, and select them
using the now renamed ARMvmovImm/etc. There is also an extra 64bit immediate
mode that I have not yet added here.

Code by David Sherwood

Differential Revision: https://reviews.llvm.org/D63884

llvm-svn: 365178
2019-07-05 10:02:43 +00:00
David Green bb7e97d783 [ARM] MVE fp to int conversions
This adds the patterns needed for fptosi and sitofp.

Differential Revision: https://reviews.llvm.org/D63729

llvm-svn: 365176
2019-07-05 09:34:30 +00:00
David Green 2b20ee4110 [ARM] Favour PL/MI over GE/LT when possible
The arm condition codes for GE is N==V (and for LT is N!=V). If the source of
flags cannot set V (overflow), such as a cmp against #0, then we can use the
simpler PL and MI conditions that only check N. As these PL/MI conditions are
simpler than GE/LT, other passes like the peephole optimiser can have a better
time optimising away the redundant CMPs.

The exception is the VSEL instruction, which cannot take the PL code, so there
the transform favours GE.

Differential Revision: https://reviews.llvm.org/D64160

llvm-svn: 365117
2019-07-04 08:58:58 +00:00
David Green d2a9ec29d0 [ARM] MVE bitwise instruction patterns
This adds patterns for the simpler VAND, VORR and VEOR bitwise vector
instructions. It also adjusts the top16Zero PatLeaf to not match on vector
instructions, which can otherwise cause problems.

Code written by David Sherwood.

Differential Revision: https://reviews.llvm.org/D63867

llvm-svn: 365113
2019-07-04 08:41:23 +00:00
Sam Parker 6005681ac6 [ARM] Fix for NDEBUG builds
Fix unused variable warning as well as a nonsense assert.

Differential Revision: https://reviews.llvm.org/D63816

llvm-svn: 365046
2019-07-03 14:39:23 +00:00
Oliver Stannard 830b20344b [ARM] Thumb2: favor R4-R7 over R12/LR in allocation order when opt for minsize
For Thumb2, we prefer low regs (costPerUse = 0) to allow narrow
encoding. However, current allocation order is like:
  R0-R3, R12, LR, R4-R11

As a result, a lot of instructs that use R12/LR will be wide instrs.

This patch changes the allocation order to:
  R0-R7, R12, LR, R8-R11
for thumb2 and -Osize.

In most cases, there is no extra push/pop instrs as they will be folded
into existing ones. There might be slight performance impact due to more
stack usage, so we only enable it when opt for min size.

https://reviews.llvm.org/D30324

llvm-svn: 365014
2019-07-03 09:58:52 +00:00
Roman Lebedev c4b83a6054 [Codegen][X86][AArch64][ARM][PowerPC] Inc-of-add vs sub-of-not (PR42457)
Summary:
This is the backend part of [[ https://bugs.llvm.org/show_bug.cgi?id=42457 | PR42457 ]].
In middle-end, we'd want to prefer the form with two adds - D63992,
but as this diff shows, not every target will prefer that pattern.

Out of 4 targets for which i added tests all seem to be ok with inc-of-add for scalars,
but only X86 prefer that same pattern for vectors.

Here i'm adding a new TLI hook, always defaulting to the inc-of-add,
but adding AArch64,ARM,PowerPC overrides to prefer inc-of-add only for scalars.

Reviewers: spatel, RKSimon, efriedma, t.p.northover, hfinkel

Reviewed By: efriedma

Subscribers: nemanjai, javed.absar, kristof.beyls, kbarton, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64090

llvm-svn: 365010
2019-07-03 09:41:35 +00:00
Eli Friedman e97aa961d3 [ARM] Fix unwind info for Thumb1 functions that save high registers.
There were two issues here: one, some of the relevant instructions were
missing the expected "FrameSetup" flag, and two,
ARMAsmPrinter::EmitUnwindingInstruction wasn't expecting "mov"
instructions in the prologue.

I'm sticking the additional state into ARMFunctionInfo so it's obvious
it only applies to the current function.

I considered a few alternative approaches where we would compute the
correct unwind information as part of the prologue/epilogue lowering,
but it seems like a lot of work to introduce pseudo-instructions, and
the current code seems to be reliable enough.

Fixes https://bugs.llvm.org/show_bug.cgi?id=42408.

Differential Revision: https://reviews.llvm.org/D63964

llvm-svn: 364970
2019-07-02 21:35:15 +00:00
Simon Tatham bffd099d15 [ARM] MVE: allow soft-float ABI to pass vector types.
Passing a vector type over the soft-float ABI involves it being split
into four GPRs, so the first thing that has to happen at the start of
the function is to recombine those into a vector register. The ABI
types all vectors as v2f64, so we need to support BUILD_VECTOR for
that type, which I do in this patch by allowing it to be expanded in
terms of INSERT_VECTOR_ELT, and writing an ISel pattern for that in
turn. Similarly, I provide a rule for EXTRACT_VECTOR_ELT so that a
returned vector can be marshalled back into GPRs.

While I'm here, I've also added ISD::UNDEF to the list of operations
we turn back on in `setAllExpand`, because I noticed that otherwise it
gets expanded into a BUILD_VECTOR with explicit zero inputs, leading
to pointless machine instructions to zero out a vector register that's
about to have every lane overwritten of in any case.

Reviewers: dmgreen, ostannard

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63937

llvm-svn: 364910
2019-07-02 11:26:11 +00:00
Simon Tatham 7b63a9533c [ARM] Stop using scalar FP instructions in integer-only MVE mode.
If you compile with `-mattr=+mve` (enabling integer MVE instructions
but not floating-point ones), then the scalar FP //registers// exist
and it's legal to move things in and out of them, load and store them,
but it's not legal to do arithmetic on them.

In D60708, the calls to `addRegisterClass` in ARMISelLowering that
enable use of the scalar FP registers became conditionalised on
`Subtarget->hasFPRegs()` instead of `Subtarget->hasVFP2Base()`, so
that loads, stores and moves of those registers would work. But I
didn't realise that that would also enable all the operations on those
types by default.

Now, if the target doesn't have basic VFP, we follow up those
`addRegisterClass` calls by turning back off all the nontrivial
operations you can perform on f32 and f64. That causes several
knock-on failures, which are fixed by allowing the `VMOVDcc` and
`VMOVScc` instructions to be selected even if all you have is
`HasFPRegs`, and adjusting several checks for 'is this a double in a
single-precision-only world?' to the more general 'is this any FP type
we can't do arithmetic on?'. Between those, the whole of the
`float-ops.ll` and `fp16-instructions.ll` tests can now run in
MVE-without-FP mode and generate correct-looking code.

One odd side effect is that I had to relax the check lines in that
test so that they permit test functions like `add_f` to be generated
as tailcalls to software FP library functions, instead of ordinary
calls. Doing that is entirely legal, but the mystery is why this is
the first RUN line that's needed the relaxation: on the usual kind of
non-FP target, no tailcalls ever seem to be generated. Going by the
llc messages, I think `SoftenFloatResult` must be perturbing the code
generation in some way, but that's as much as I can guess.

Reviewers: dmgreen, ostannard

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63938

llvm-svn: 364909
2019-07-02 11:26:00 +00:00
Mikhail Maltsev 8b2e304bc5 [ARM] Fix MVE_VQxDMLxDH instruction class
Summary:
According to the ARMARM, the VQDMLADH, VQRDMLADH, VQDMLSDH and
VQRDMLSDH instructions handle their results as follows: "The base
variant writes the results into the lower element of each pair of
elements in the destination register, whereas the exchange variant
writes to the upper element in each pair". I.e., the initial content
of the output register affects the result, as usual, we model this
with an additional input.

Also, for 32-bit variants Qd is not allowed to be the same register as
Qm and Qn, we use @earlyclobber to indicate this.

This patch also changes vpred_r to vpred_n because the instructions
don't have an explicit 'inactive' operand.

Reviewers: dmgreen, ostannard, simon_tatham

Reviewed By: simon_tatham

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64007

llvm-svn: 364796
2019-07-01 16:07:58 +00:00
Mikhail Maltsev 4a9e3f15bb [ARM] MVE: support QQPRRegClass and QQQQPRRegClass
Summary:
QQPRRegClass and QQQQPRRegClass are used by the
interleaving/deinterleaving loads/stores to represent sequences of
consecutive SIMD registers.

Reviewers: ostannard, simon_tatham, dmgreen

Reviewed By: simon_tatham

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64009

llvm-svn: 364794
2019-07-01 16:05:23 +00:00
Sam Parker 98722691b0 [ARM] WLS/LE Code Generation
Backend changes to enable WLS/LE low-overhead loops for armv8.1-m:
1) Use TTI to communicate to the HardwareLoop pass that we should try
   to generate intrinsics that guard the loop entry, as well as setting
   the loop trip count.
2) Lower the BRCOND that uses said intrinsic to an Arm specific node:
   ARMWLS.
3) ISelDAGToDAG the node to a new pseudo instruction:
   t2WhileLoopStart.
4) Add support in ArmLowOverheadLoops to handle the new pseudo
   instruction.

Differential Revision: https://reviews.llvm.org/D63816

llvm-svn: 364733
2019-07-01 08:21:28 +00:00
Sam Tebbs e39e958da3 [ARM] Add support for the MVE long shift instructions
MVE adds the lsll, lsrl and asrl instructions, which perform a shift on a 64 bit value separated into two 32 bit registers.

The Expand64BitShift function is modified to accept ISD::SHL, ISD::SRL and ISD::SRA and convert it into the appropriate opcode in ARMISD. An SHL is converted into an lsll, an SRL is converted into an lsrl for the immediate form and a negation and lsll for the register form, and SRA is converted into an asrl.

test/CodeGen/ARM/shift_parts.ll is added to test the logic of emitting these instructions.

Differential Revision: https://reviews.llvm.org/D63430

llvm-svn: 364654
2019-06-28 15:43:31 +00:00
David Green 9dbdfe6b78 [ARM] Add MVE mul patterns
This simply adds integer and floating point VMUL patterns for MVE, same as we
have add and sub.

Differential Revision: https://reviews.llvm.org/D63866

llvm-svn: 364643
2019-06-28 11:44:03 +00:00
David Green 2883944035 [ARM] Mark math routines as non-legal for MVE
This adds handling and tests for a number of floating point math routines,
which have no MVE instructions.

Differential Revision: https://reviews.llvm.org/D63725

llvm-svn: 364641
2019-06-28 11:17:38 +00:00
David Green ff70cbc895 [ARM] MVE patterns for VABS and VNEG
This simply adds the required patterns for fp neg and abs.

Differential Revision: https://reviews.llvm.org/D63861

llvm-svn: 364640
2019-06-28 10:25:35 +00:00
David Green eb7080ac6e [ARM] Widening loads and narrowing stores
MVE has instructions to widen as it loads, and narrow as it stores. This adds
the required patterns and legalisation to make them work including specifying
that they are legal, patterns to select them and test changes.

Patch by David Sherwood.

Differential Revision: https://reviews.llvm.org/D63839

llvm-svn: 364636
2019-06-28 09:47:55 +00:00
Simon Tatham 29ff1b4f46 [ARM] Fix integer UB in MVE load/store immediate handling.
llvm-svn: 364635
2019-06-28 09:28:39 +00:00
David Green 07e53fee14 [ARM] MVE loads and stores
This fills in the gaps for basic MVE loads and stores, allowing unaligned
access and adding far too many tests. These will become important as
narrowing/expanding and pre/post inc are added. Big endian might still not be
handled very well, because we have not yet added bitcasts (and I'm not sure how
we want it to work yet). I've included the alignment code anyway which maps
with our current patterns. We plan to return to that later.

Code written by Simon Tatham, with additional tests from Me and Mikhail Maltsev.

Differential Revision: https://reviews.llvm.org/D63838

llvm-svn: 364633
2019-06-28 08:41:40 +00:00