Commit Graph

10105 Commits

Author SHA1 Message Date
David Green 8be372b190 [ARM] MVE vector shuffles
This patch adds necessary shuffle vector and buildvector support for ARM MVE.
It essentially adds support for VDUP, VREVs and some VMOVs, which are often
required by other code (like upcoming patches).

This mostly uses the same code from Neon that already generated
NEONvdup/NEONvduplane/NEONvrev's. These have been renamed to ARMvdup/etc and
moved to ARMInstrInfo as they are common to both architectures. Most of the
selection code seems to be applicable to both, but NEON does have some more
instructions making some parts specific.

Most code originally by David Sherwood.

Differential Revision: https://reviews.llvm.org/D63567

llvm-svn: 364626
2019-06-28 07:08:42 +00:00
Sam Tebbs 8747c5f482 [ARM] Fix formatting issue in ARMISelLowering.cpp
Fix a formatting error in ARMISelLowering.cpp::Expand64BitShift. My test
commit after receiving write access.

llvm-svn: 364560
2019-06-27 16:28:28 +00:00
Simon Tatham 1a3dc8f678 [ARM] Fix bogus assertions in copyPhysReg v8.1-M cases.
The code to generate register move instructions in and out of VPR and
FPSCR_NZCV had assertions checking that the other register involved
was a GPR _pair_, instead of a single GPR as it should have been.

Reviewers: miyuki, ostannard

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63865

llvm-svn: 364534
2019-06-27 12:41:12 +00:00
Simon Tatham ffb2b347ff [ARM] Fix handling of zero offsets in LOB instructions.
The BF and WLS/WLSTP instructions have various branch-offset fields
occupying different positions and lengths in the instruction encoding,
and all of them were decoded at disassembly time by the function
DecodeBFLabelOffset() which returned SoftFail if the offset was zero.

In fact, it's perfectly fine and not even a SoftFail for most of those
offset fields to be zero. The only one that can't be zero is the 4-bit
field labelled `boff` in the architecture spec, occupying bits {26-23}
of the BF instruction family. If that one is zero, the encoding
overlaps other instructions (WLS, DLS, LETP, VCTP), so it ought to be
a full Fail.

Fixed by adding an extra template parameter to DecodeBFLabelOffset
which controls whether a zero offset is accepted or rejected. Adjusted
existing tests (only in error messages for bad disassemblies); added
extra tests to demonstrate zero offsets being accepted in all the
right places, and a few demonstrating rejection of zero `boff`.

Reviewers: DavidSpickett, ostannard

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63864

llvm-svn: 364533
2019-06-27 12:41:07 +00:00
Simon Tatham e5ce56fb95 [ARM] Make coprocessor number restrictions consistent.
Different versions of the Arm architecture disallow the use of generic
coprocessor instructions like MCR and CDP on different sets of
coprocessors. This commit centralises the check of the coprocessor
number so that it's consistent between assembly and disassembly, and
also updates it for the new restrictions in Arm v8.1-M.

New tests added that check all the coprocessor numbers; old tests
updated, where they used a number that's now become illegal in the
context in question.

Reviewers: DavidSpickett, ostannard

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63863

llvm-svn: 364532
2019-06-27 12:40:55 +00:00
Simon Tatham 02449f9c3c [ARM] Tighten restrictions on use of SP in v8.1-M CSEL.
In the `CSEL Rd,Rm,Rn` instruction family (also including CSINC, CSINV
and CSNEG), the architecture lists it as CONSTRAINED UNPREDICTABLE
(i.e. SoftFail) to use SP in the Rd or Rm slot, but outright illegal
to use it in the Rn slot, not least because some encodings of that
form are used by MVE instructions such as UQRSHLL.

MC was treating all three slots the same, as SoftFail. So the only
reason UQRSHLL was disassembled correctly at all was because the MVE
decode table is separate from the Thumb2 one and takes priority; if
you turned off MVE, then encodings such as `[0x5f,0xea,0x0d,0x83]`
would disassemble as spurious CSELs.

Fixed by inventing another version of the `GPRwithZR` register class,
which disallows SP completely instead of just SoftFailing it.

Reviewers: DavidSpickett, ostannard

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63862

llvm-svn: 364531
2019-06-27 12:40:40 +00:00
Diana Picus 43fb5ae50c [GlobalISel] Accept multiple vregs for lowerCall's args
Change the interface of CallLowering::lowerCall to accept several
virtual registers for each argument, instead of just one.  This is a
follow-up to D46018.

CallLowering::lowerReturn was similarly refactored in D49660 and
lowerFormalArguments in D63549.

With this change, we no longer pack the virtual registers generated for
aggregates into one big lump before delegating to the target. Therefore,
the target can decide itself whether it wants to handle them as separate
pieces or use one big register.

ARM and AArch64 have been updated to use the passed in virtual registers
directly, which means we no longer need to generate so many
merge/extract instructions.

NFCI for AMDGPU, Mips and X86.

Differential Revision: https://reviews.llvm.org/D63551

llvm-svn: 364512
2019-06-27 09:18:03 +00:00
Diana Picus 8138996128 [GlobalISel] Accept multiple vregs for lowerCall's result
Change the interface of CallLowering::lowerCall to accept several
virtual registers for the call result, instead of just one.  This is a
follow-up to D46018.

CallLowering::lowerReturn was similarly refactored in D49660 and
lowerFormalArguments in D63549.

With this change, we no longer pack the virtual registers generated for
aggregates into one big lump before delegating to the target. Therefore,
the target can decide itself whether it wants to handle them as separate
pieces or use one big register.

ARM and AArch64 have been updated to use the passed in virtual registers
directly, which means we no longer need to generate so many
merge/extract instructions.

NFCI for AMDGPU, Mips and X86.

Differential Revision: https://reviews.llvm.org/D63550

llvm-svn: 364511
2019-06-27 09:15:53 +00:00
Diana Picus c3dbe23977 [GlobalISel] Accept multiple vregs in lowerFormalArgs
Change the interface of CallLowering::lowerFormalArguments to accept
several virtual registers for each formal argument, instead of just one.
This is a follow-up to D46018.

CallLowering::lowerReturn was similarly refactored in D49660. lowerCall
will be refactored in the same way in follow-up patches.

With this change, we forward the virtual registers generated for
aggregates to CallLowering. Therefore, the target can decide itself
whether it wants to handle them as separate pieces or use one big
register. We also copy the pack/unpackRegs helpers to CallLowering to
facilitate this.

ARM and AArch64 have been updated to use the passed in virtual registers
directly, which means we no longer need to generate so many
merge/extract instructions.

AArch64 seems to have had a bug when lowering e.g. [1 x i8*], which was
put into a s64 instead of a p0. Added a test-case which illustrates the
problem more clearly (it crashes without this patch) and fixed the
existing test-case to expect p0.

AMDGPU has been updated to unpack into the virtual registers for
kernels. I think the other code paths fall back for aggregates, so this
should be NFC.

Mips doesn't support aggregates yet, so it's also NFC.

x86 seems to have code for dealing with aggregates, but I couldn't find
the tests for it, so I just added a fallback to DAGISel if we get more
than one virtual register for an argument.

Differential Revision: https://reviews.llvm.org/D63549

llvm-svn: 364510
2019-06-27 08:54:17 +00:00
Diana Picus 69ce1c1319 [GlobalISel] Allow multiple VRegs in ArgInfo. NFC
Allow CallLowering::ArgInfo to contain more than one virtual register.
This is useful when passes split aggregates into several virtual
registers, but need to also provide information about the original type
to the call lowering. Used in follow-up patches.

Differential Revision: https://reviews.llvm.org/D63548

llvm-svn: 364509
2019-06-27 08:50:53 +00:00
Eli Friedman ab1d73ee32 [ARM] Don't reserve R12 on Thumb1 as an emergency spill slot.
The current implementation of ThumbRegisterInfo::saveScavengerRegister
is bad for two reasons: one, it's buggy, and two, it blocks using R12
for other optimizations.  So this patch gets rid of it, and adds the
necessary support for using an ordinary emergency spill slot on Thumb1.

(Specifically, I think saveScavengerRegister was broken by r305625, and
nobody noticed for two years because the codepath is almost never used.
The new code will also probably not be used much, but it now has better
tests, and if we fail to emit a necessary emergency spill slot we get a
reasonable error message instead of a miscompile.)

A rough outline of the changes in the patch:

1. Gets rid of ThumbRegisterInfo::saveScavengerRegister.
2. Modifies ARMFrameLowering::determineCalleeSaves to allocate an
emergency spill slot for Thumb1.
3. Implements useFPForScavengingIndex, so the emergency spill slot isn't
placed at a negative offset from FP on Thumb1.
4. Modifies the heuristics for allocating an emergency spill slot to
support Thumb1.  This includes fixing ExtraCSSpill so we don't try to
use "lr" as a substitute for allocating an emergency spill slot.
5. Allocates a base pointer in more cases, so the emergency spill slot
is always accessible.
6. Modifies ARMFrameLowering::ResolveFrameIndexReference to compute the
right offset in the new cases where we're forcing a base pointer.
7. Ensures we never generate a load or store with an offset outside of
its frame object.  This makes the heuristics more straightforward.
8. Changes Thumb1 prologue and epilogue emission so it never uses
register scavenging.

Some of the changes to the emergency spill slot heuristics in
determineCalleeSaves affect ARM/Thumb2; hopefully, they should allow
the compiler to avoid allocating an emergency spill slot in cases
where it isn't necessary. The rest of the changes should only affect
Thumb1.

Differential Revision: https://reviews.llvm.org/D63677

llvm-svn: 364490
2019-06-26 23:46:51 +00:00
Mikhail Maltsev 6dcbb3161e [ARM] Handle fixup_arm_pcrel_9 correctly on big-endian targets
Summary:
The getFixupKindContainerSizeBytes function returns the size of the
instruction containing a given fixup. Currently fixup_arm_pcrel_9 is
not handled in this function, this causes an assertion failure in
the debug build and incorrect codegen in the release build.

This patch fixes the problem.

Reviewers: ostannard, simon_tatham

Reviewed By: ostannard

Subscribers: javed.absar, kristof.beyls, hiraditya, pbarrio, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63778

llvm-svn: 364404
2019-06-26 10:48:40 +00:00
Fangrui Song 6a4c68e187 [ARM] Fix -Wimplicit-fallthrough after D60709/r364331
llvm-svn: 364376
2019-06-26 02:34:10 +00:00
Simon Tatham e8de8ba6a6 [ARM] Support inline assembler constraints for MVE.
"To" selects an odd-numbered GPR, and "Te" an even one. There are some
8.1-M instructions that have one too few bits in their register fields
and require registers of particular parity, without necessarily using
a consecutive even/odd pair.

Also, the constraint letter "t" should select an MVE q-register, when
MVE is present. This didn't need any source changes, but some extra
tests have been added.

Reviewers: dmgreen, samparker, SjoerdMeijer

Subscribers: javed.absar, eraman, kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D60709

llvm-svn: 364331
2019-06-25 16:49:32 +00:00
Simon Tatham a4b415a683 [ARM] Code-generation infrastructure for MVE.
This provides the low-level support to start using MVE vector types in
LLVM IR, loading and storing them, passing them to __asm__ statements
containing hand-written MVE vector instructions, and *if* you have the
hard-float ABI turned on, using them as function parameters.

(In the soft-float ABI, vector types are passed in integer registers,
and combining all those 32-bit integers into a q-reg requires support
for selection DAG nodes like insert_vector_elt and build_vector which
aren't implemented yet for MVE. In fact I've also had to add
`arm_aapcs_vfpcc` to a couple of existing tests to avoid that
problem.)

Specifically, this commit adds support for:

 * spills, reloads and register moves for MVE vector registers

 * ditto for the VPT predication mask that lives in VPR.P0

 * make all the MVE vector types legal in ISel, and provide selection
   DAG patterns for BITCAST, LOAD and STORE

 * make loads and stores of scalar FP types conditional on
   `hasFPRegs()` rather than `hasVFP2Base()`. As a result a few
   existing tests needed their llc command lines updating to use
   `-mattr=-fpregs` as their method of turning off all hardware FP
   support.

Reviewers: dmgreen, samparker, SjoerdMeijer

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60708

llvm-svn: 364329
2019-06-25 16:48:46 +00:00
Sam Parker bcf0eb7a64 [ARM] Fix for DLS/LE CodeGen
The expensive buildbots highlighted the mir tests were broken, which
I've now updated and added --verify-machineinstrs to them. This also
uncovered a couple of bugs in the backend pass, so these have also
been fixed.

llvm-svn: 364323
2019-06-25 15:11:17 +00:00
Fangrui Song 807d2f442a [ARM] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D60692
llvm-svn: 364312
2019-06-25 13:28:44 +00:00
Simon Tatham 287f0403e3 [ARM] Fix buildbot failure due to -Werror.
Including both 'case ARM_AM::uxtw' and 'default' in the getShiftOp
switch caused a buildbot to fail with

error: default label in switch which covers all enumeration values [-Werror,-Wcovered-switch-default]
llvm-svn: 364300
2019-06-25 12:23:46 +00:00
Sjoerd Meijer 74ec25a197 [ARM] MVE VPT Blocks
A minor iteration on the MVE VPT Block pass to enable more efficient VPT Block
code generation: consecutive VPT predicated statements, predicated on the same
condition, will be placed within the same VPT Block. This essentially is also
an exercise to write some more tests for the next step, which should be more
generic also merging instructions when they are not consecutive.

Differential Revision: https://reviews.llvm.org/D63711

llvm-svn: 364298
2019-06-25 12:04:31 +00:00
Simon Tatham 4cf18c2849 [ARM] Explicit lowering of half <-> double conversions.
If an FP_EXTEND or FP_ROUND isel dag node converts directly between
f16 and f32 when the target CPU has no instruction to do it in one go,
it has to be done in two steps instead, going via f32.

Previously, this was done implicitly, because all such CPUs had the
storage-only implementation of f16 (i.e. the only thing you can do
with one at all is to convert it to/from f32). So isel would legalize
the f16 into an f32 as soon as it saw it, by inserting an fp16_to_fp
node (or vice versa), and then the fp_extend would already be f32->f64
rather than f16->f64.

But that technique can't support a target CPU which has full f16
support but _not_ f64, such as some variants of Arm v8.1-M. So now we
provide custom lowering for FP_EXTEND and FP_ROUND, which checks
support for f16 and f64 and decides on the best thing to do given the
combination of flags it gets back.

Reviewers: dmgreen, samparker, SjoerdMeijer

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60692

llvm-svn: 364294
2019-06-25 11:24:50 +00:00
Simon Tatham 86b7a1e660 [ARM] Add remaining miscellaneous MVE instructions.
This final batch includes the tail-predicated versions of the
low-overhead loop instructions (LETP); the VPSEL instruction to select
between two vector registers based on the predicate mask without
having to open a VPT block; and VPNOT which complements the predicate
mask in place.

Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62681

llvm-svn: 364292
2019-06-25 11:24:33 +00:00
Simon Tatham e6824160dd [ARM] Add MVE vector load/store instructions.
This adds the rest of the vector memory access instructions. It
includes contiguous loads/stores, with an ordinary addressing mode
such as [r0,#offset] (plus writeback variants); gather loads and
scatter stores with a scalar base address register and a vector of
offsets from it (written [r0,q1] or similar); and gather/scatters with
a vector of base addresses (written [q0,#offset], again with
writeback). Additionally, some of the loads can widen each loaded
value into a larger vector lane, and the corresponding stores narrow
them again.

To implement these, we also have to add the addressing modes they
need. Also, in AsmParser, the `isMem` query function now has
subqueries `isGPRMem` and `isMVEMem`, according to which kind of base
register is used by a given memory access operand.

I've also had to add an extra check in `checkTargetMatchPredicate` in
the AsmParser, without which our last-minute check of `rGPR` register
operands against SP and PC was failing an assertion because Tablegen
had inserted an immediate 0 in place of one of a pair of tied register
operands. (This matches the way the corresponding check for `MCK_rGPR`
in `validateTargetOperandClass` is guarded.) Apparently the MVE load
instructions were the first to have ever triggered this assertion, but
I think only because they were the first to have a combination of the
usual Arm pre/post writeback system and the `rGPR` class in particular.

Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62680

llvm-svn: 364291
2019-06-25 11:24:18 +00:00
Sam Parker a6fd919cb3 [ARM] DLS/LE low-overhead loop code generation
Introduce three pseudo instructions to be used during DAG ISel to
represent v8.1-m low-overhead loops. One maps to set_loop_iterations
while loop_decrement_reg is lowered to two, so that we can separate
the decrement and branching operations. The pseudo instructions are
expanded pre-emission, where we can still decide whether we actually
want to generate a low-overhead loop, in a new pass:
ARMLowOverheadLoops. The pass currently bails, reverting to an sub,
icmp and br, in the cases where a call or stack spill/restore happens
between the decrement and branching instructions, or if the loop is
too large.

Differential Revision: https://reviews.llvm.org/D63476

llvm-svn: 364288
2019-06-25 10:45:51 +00:00
Matt Arsenault faeaedf8e9 GlobalISel: Remove unsigned variant of SrcOp
Force using Register.

One downside is the generated register enums require explicit
conversion.

llvm-svn: 364194
2019-06-24 16:16:12 +00:00
Matt Arsenault e3a676e9ad CodeGen: Introduce a class for registers
Avoids using a plain unsigned for registers throughoug codegen.
Doesn't attempt to change every register use, just something a little
more than the set needed to build after changing the return type of
MachineOperand::getReg().

llvm-svn: 364191
2019-06-24 15:50:29 +00:00
Simon Tatham fe8017621e [ARM] Add MVE interleaving load/store family.
This adds the family of loads and stores with names like VLD20.8 and
VST42.32, which load and store parts of multiple q-registers in such a
way that executing both VLD20 and VLD21, or all four of VLD40..VLD43,
will distribute 2 or 4 vectors' worth of memory data across the lanes
of the same number of registers but in a transposed order.

In addition to the Tablegen descriptions of the instructions
themselves, this patch also adds encode and decode support for the
QQPR and QQQQPR register classes (representing the range of loaded or
stored vector registers), and tweaks to the parsing system for lists
of vector registers to make it return the right format in this case
(since, unlike NEON, MVE regards q-registers as primitive, and not
just an alias for two d-registers).

llvm-svn: 364172
2019-06-24 10:00:39 +00:00
Simon Pilgrim bdea88325f Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI.
llvm-svn: 364068
2019-06-21 16:11:18 +00:00
Simon Tatham 0c7af66450 [ARM] Add MVE 64-bit GPR <-> vector move instructions.
These instructions let you load half a vector register at once from
two general-purpose registers, or vice versa.

The assembly syntax for these instructions mentions the vector
register name twice. For the move _into_ a vector register, the MC
operand list also has to mention the register name twice (once as the
output, and once as an input to represent where the unchanged half of
the output register comes from). So we can conveniently assign one of
the two asm operands to be the output $Qd, and the other $QdSrc, which
avoids confusing the auto-generated AsmMatcher too much. For the move
_from_ a vector register, there's no way to get round the fact that
both instances of that register name have to be inputs, so we need a
custom AsmMatchConverter to avoid generating two separate output MC
operands. (And even that wouldn't have worked if it hadn't been for
D60695.)

Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62679

llvm-svn: 364041
2019-06-21 13:17:23 +00:00
Simon Tatham bafb105e96 [ARM] Add MVE vector instructions that take a scalar input.
This adds the `MVE_qDest_rSrc` superclass and all its instances, plus
a few other instructions that also take a scalar input register or two.

I've also belatedly added custom diagnostic messages to the operand
classes for odd- and even-numbered GPRs, which required matching
changes in two of the existing MVE assembly test files.

Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62678

llvm-svn: 364040
2019-06-21 13:17:08 +00:00
Simon Tatham a6b6a15701 [ARM] Add a batch of similarly encoded MVE instructions.
Summary:
This adds the `MVE_qDest_qSrc` superclass and all instructions that
inherit from it. It's not the complete class of _everything_ with a
q-register as both destination and source; it's a subset of them that
all have similar encodings (but it would have been hopelessly unwieldy
to call it anything like MVE_111x11100).

This category includes add/sub with carry; long multiplies; halving
multiplies; multiply and accumulate, and some more complex
instructions.

Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62677

llvm-svn: 364037
2019-06-21 12:13:59 +00:00
Fangrui Song d5cf95e41c [ARM] Fix -Wimplicit-fallthrough after D62675
llvm-svn: 364028
2019-06-21 11:19:11 +00:00
Simon Tatham 7d76f8acf0 [ARM] Add MVE vector compare instructions.
Summary:
These take a pair of vector register to compare, and a comparison type
(written in the form of an Arm condition suffix); they output a vector
of booleans in the VPR register, where predication can conveniently
use them.

Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62676

llvm-svn: 364027
2019-06-21 11:14:51 +00:00
Simon Tatham c9b2cd4674 [ARM] Add a batch of MVE floating-point instructions.
Summary:
This includes floating-point basic arithmetic (add/sub/multiply),
complex add/multiply, unary negation and absolute value, rounding to
integer value, and conversion to/from integer formats.

Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62675

llvm-svn: 364013
2019-06-21 09:35:07 +00:00
Fangrui Song dc8de6037c Simplify std::lower_bound with llvm::{bsearch,lower_bound}. NFC
llvm-svn: 364006
2019-06-21 05:40:31 +00:00
Eli Friedman 25f08a17c3 [ARM GlobalISel] Add support for s64 G_ADD and G_SUB.
Teach RegisterBankInfo to use the correct register class, and tell the
legalizer it's legal.  Everything else just works.

The one thing that's slightly weird about this compared to SelectionDAG
isel is that legalization can't distinguish between i64 and <1 x i64>,
so we might end up with more NEON instructions than the user expects.

Differential Revision: https://reviews.llvm.org/D63585

llvm-svn: 363989
2019-06-20 21:56:47 +00:00
Simon Tatham 232db11020 [ARM] Add a batch of MVE integer instructions.
This includes integer arithmetic of various kinds (add/sub/multiply,
saturating and not), and the immediate forms of VMOV and VMVN that
load an immediate into all lanes of a vector.

Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62674

llvm-svn: 363936
2019-06-20 15:16:56 +00:00
Eli Friedman d88e28d13e [llvm-objdump] Switch between ARM/Thumb based on mapping symbols.
The ARMDisassembler changes allow changing between ARM and Thumb mode
based on the MCSubtargetInfo, rather than the Target, which simplifies
the other changes a bit.

I'm not really happy with adding more target-specific logic to
tools/llvm-objdump/, but there isn't any easy way around it: the logic
in question specifically applies to disassembling an object file, and
that code simply isn't located in lib/Target, at least at the moment.

Differential Revision: https://reviews.llvm.org/D60927

llvm-svn: 363903
2019-06-20 00:29:40 +00:00
Simon Tatham 2f5188fd58 [ARM] Add MVE vector bit-operations (register inputs).
This includes all the obvious bitwise operations (AND, OR, BIC, ORN,
MVN) in register-to-register forms, and the immediate forms of
AND/OR/BIC/ORN; byte-order reverse instructions; and the VMOVs that
access a single lane of a vector.

Some of those VMOVs (specifically, the ones that access a 32-bit lane)
share an encoding with existing instructions that were disassembled as
accessing half of a d-register (e.g. `vmov.32 r0, d1[0]`), but in
8.1-M they're now written as accessing a quarter of a q-register (e.g.
`vmov.32 r0, q0[2]`). The older syntax is still accepted by the
assembler.

Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62673

llvm-svn: 363838
2019-06-19 16:43:53 +00:00
Chen Zheng c5b918de58 [NFC] move some hardware loop checking code to a common place for other using.
Differential Revision: https://reviews.llvm.org/D63478

llvm-svn: 363758
2019-06-19 01:26:31 +00:00
Huihui Zhang d16779a732 [ARM] Comply with rules on ARMv8-A thumb mode partial deprecation of IT.
Summary:
When identifing instructions that can be folded into a MOVCC instruction,
checking for a predicate operand is not enough, also need to check for
thumb2 function, with restrict-IT, is the machine instruction eligible for
ARMv8 IT or not.

Notes in ARMv8-A Architecture Reference Manual, section "Partial deprecation of IT"
  https://usermanual.wiki/Pdf/ARM20Architecture20Reference20ManualARMv8.1667877052.pdf

"ARMv8-A deprecates some uses of the T32 IT instruction. All uses of IT that apply to
instructions other than a single subsequent 16-bit instruction from a restricted set
are deprecated, as are explicit references to the PC within that single 16-bit
instruction. This permits the non-deprecated forms of IT and subsequent instructions
to be treated as a single 32-bit conditional instruction."

Reviewers: efriedma, lebedev.ri, t.p.northover, jmolloy, aemerson, compnerd, stoklund, ostannard

Reviewed By: ostannard

Subscribers: ostannard, javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63474

llvm-svn: 363739
2019-06-18 20:55:09 +00:00
Simon Tatham cfc70782d7 [ARM] Add MVE vector shift instructions.
This includes saturating and non-saturating shifts, both with
immediate shift count and with the shift counts given by another
vector register; VSHLC (in which the bits shifted out of each active
vector lane are shifted in to the next active lane); and also VMOVL,
which is enough like an immediate shift that it didn't fit too badly
in this category.

Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62672

llvm-svn: 363696
2019-06-18 16:19:59 +00:00
Simon Tatham faaf1a5366 [ARM] Add MVE integer vector min/max instructions.
Summary:
These form a small family of their own, to go with the floating-point
VMINNM/VMAXNM instructions added in a previous commit.

They introduce the first of many special cases in the mnemonic
recognition code, because VMIN with the E suffix used by the VPT
predication system needs to avoid being interpreted as the nonexistent
instruction 'VMI' with an ordinary 'NE' condition suffix.

Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62671

llvm-svn: 363695
2019-06-18 15:51:46 +00:00
Simon Tatham ed4a602515 [ARM] Rename MVE instructions in Tablegen for consistency.
Summary:
Their names began with a mishmash of `MVE_`, `t2` and no prefix at
all. Now they all start with `MVE_`, which seems like a reasonable
choice on the grounds that (a) NEON is the thing they're most at risk
of being confused with, and (b) MVE implies Thumb-2, so a prefix
indicating MVE is strictly more specific than one indicating Thumb-2.

Reviewers: ostannard, SjoerdMeijer, dmgreen

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63492

llvm-svn: 363690
2019-06-18 15:05:42 +00:00
Sjoerd Meijer 7a7009f7c8 [ARM] Some Thumb2ITBlock clean ups. NFC
Some more refactoring, like registering the IT Block pass, less cryptic
variable names, and some simplification of loops.

Differential Revision: https://reviews.llvm.org/D63419

llvm-svn: 363666
2019-06-18 12:13:11 +00:00
Sam Parker 1bd3d00e7e [CodeGen] Check for HardwareLoop Latch ExitBlock
The HardwareLoops pass finds exit blocks with a scevable exit count.
If the target specifies to update the loop counter in a register,
through a phi, we need to ensure that the exit block is a latch so
that we can insert the phi with the correct value for the incoming
edge.

Differential Revision: https://reviews.llvm.org/D63336

llvm-svn: 363556
2019-06-17 13:39:28 +00:00
Fangrui Song 4bde5d3c08 [ARM] Fix another -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D63265
llvm-svn: 363535
2019-06-17 09:29:50 +00:00
Fangrui Song 89d6905c59 [ARM] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D63265
llvm-svn: 363534
2019-06-17 09:26:50 +00:00
Sam Parker a059efa885 [ARM] Remove ARMComputeBlockSize
Forgot to remove file!

llvm-svn: 363532
2019-06-17 09:13:10 +00:00
Sam Parker f7c0b3aeb2 [ARM] Add ARMBasicBlockInfo.cpp
Forgot to add file!

llvm-svn: 363531
2019-06-17 09:05:43 +00:00
Sam Parker 966f4e874e [ARM] Extract some code from ARMConstantIslandPass
Create the ARMBasicBlockUtils class for tracking and querying basic
blocks sizes so we can use them when generating low-overhead loops.

Differential Revision: https://reviews.llvm.org/D63265

llvm-svn: 363530
2019-06-17 08:49:09 +00:00