Commit Graph

10046 Commits

Author SHA1 Message Date
Mikhail Maltsev 7bd5c55cad [ARM] First MVE instructions: scalar shifts.
This introduces a new decoding table for MVE instructions, and starts
by adding the family of scalar shift instructions that are part of the
MVE architecture extension: saturating shifts within a single GPR, and
long shifts across a pair of GPRs (both saturating and normal).

Some of these shift instructions have only 3-bit register fields in
the encoding, with the low bit fixed. So they can only address an odd
or even numbered GPR (depending on the operand), and therefore I add
two new register classes, GPREven and GPROdd.

Differential Revision: https://reviews.llvm.org/D62668

Change-Id: Iad95d5f83d26aef70c674027a184a6b1e0098d33
llvm-svn: 363051
2019-06-11 12:04:32 +00:00
Simon Tatham 14241378d3 [ARM] Fix unused-variable warning in rL363039.
The variable `OffsetMask` is currently only used in an assertion, so
if assertions are compiled out and -Werror is enabled, it becomes a
build failure.

llvm-svn: 363043
2019-06-11 10:09:12 +00:00
Simon Tatham 8c865cacda [ARM] Add the non-MVE instructions in Arm v8.1-M.
This adds support for the new family of conditional selection /
increment / negation instructions; the low-overhead branch
instructions (e.g. BF, WLS, DLS); the CLRM instruction to zero a whole
list of registers at once; the new VMRS/VMSR and VLDR/VSTR
instructions to get data in and out of 8.1-M system registers,
particularly including the new VPR register used by MVE vector
predication.

To support this, we also add a register name 'zr' (used by the CSEL
family to force one of the inputs to the constant 0), and operand
types for lists of registers that are also allowed to include APSR or
VPR (used by CLRM). The VLDR/VSTR instructions also need a new
addressing mode.

The low-overhead branch instructions exist in their own separate
architecture extension, which we treat as enabled by default, but you
can say -mattr=-lob or equivalent to turn it off.

Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

Reviewed By: samparker

Subscribers: miyuki, javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62667

llvm-svn: 363039
2019-06-11 09:29:18 +00:00
Tom Stellard 4b0b26199b Revert CMake: Make most target symbols hidden by default
This reverts r362990 (git commit 374571301d)

This was causing linker warnings on Darwin:

ld: warning: direct access in function 'llvm::initializeEvexToVexInstPassPass(llvm::PassRegistry&)'
from file '../../lib/libLLVMX86CodeGen.a(X86EvexToVex.cpp.o)' to global weak symbol
'void std::__1::__call_once_proxy<std::__1::tuple<void* (&)(llvm::PassRegistry&),
std::__1::reference_wrapper<llvm::PassRegistry>&&> >(void*)' from file '../../lib/libLLVMCore.a(Verifier.cpp.o)'
means the weak symbol cannot be overridden at runtime. This was likely caused by different translation
units being compiled with different visibility settings.

llvm-svn: 363028
2019-06-11 03:21:13 +00:00
Tom Stellard 374571301d CMake: Make most target symbols hidden by default
Summary:
For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF
this change makes all symbols in the target specific libraries hidden
by default.

A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these
libraries public, which is mainly needed for the definitions of the
LLVMInitialize* functions.

This patch reduces the number of public symbols in libLLVM.so by about
25%.  This should improve load times for the dynamic library and also
make abi checker tools, like abidiff require less memory when analyzing
libLLVM.so

One side-effect of this change is that for builds with
LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that
access symbols that are no longer public will need to be statically linked.

Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1):
nm before/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
36221
nm after/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
26278

Reviewers: chandlerc, beanz, mgorny, rnk, hans

Reviewed By: rnk, hans

Subscribers: Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D54439

llvm-svn: 362990
2019-06-10 22:12:56 +00:00
Simon Tatham 67065c5c70 Revert rL362953 and its followup rL362955.
These caused a build failure because I managed not to notice they
depended on a later unpushed commit in my current stack. Sorry about
that.

llvm-svn: 362956
2019-06-10 15:58:19 +00:00
Simon Tatham 42078d41d5 [ARM] Add the non-MVE instructions in Arm v8.1-M.
This should have been part of r362953, but I had a finger-trouble
incident and committed the old rather than new version of the patch.
Sorry.

llvm-svn: 362955
2019-06-10 15:41:58 +00:00
Simon Tatham baeea91933 [ARM] Add the non-MVE instructions in Arm v8.1-M.
This adds support for the new family of conditional selection /
increment / negation instructions; the low-overhead branch
instructions (e.g. BF, WLS, DLS); the CLRM instruction to zero a whole
list of registers at once; the new VMRS/VMSR and VLDR/VSTR
instructions to get data in and out of 8.1-M system registers,
particularly including the new VPR register used by MVE vector
predication.

To support this, we also add a register name 'zr' (used by the CSEL
family to force one of the inputs to the constant 0), and operand
types for lists of registers that are also allowed to include APSR or
VPR (used by CLRM). The VLDR/VSTR instructions also need some new
addressing modes.

The low-overhead branch instructions exist in their own separate
architecture extension, which we treat as enabled by default, but you
can say -mattr=-lob or equivalent to turn it off.

Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

Reviewed By: samparker

Subscribers: miyuki, javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62667

llvm-svn: 362953
2019-06-10 15:36:34 +00:00
Simon Tatham b87669f166 [ARM] Disallow PC, and optionally SP, in VMOVRH and VMOVHR.
Arm v8.1-M supports the VMOV instructions that move a half-precision
value to and from a GPR, but not if the GPR is SP or PC.

To fix this, I've changed those instructions to use the rGPR register
class instead of GPR. rGPR always excludes PC, and it excludes SP
except in the presence of the HasV8Ops target feature (i.e. Arm v8-A).
So the effect is that VMOV.F16 to and from PC is now illegal
everywhere, but VMOV.F16 to and from SP is illegal only on non-v8-A
cores (which I believe is all as it should be).

Reviewers: dmgreen, samparker, SjoerdMeijer, ostannard

Reviewed By: ostannard

Subscribers: ostannard, javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60704

llvm-svn: 362942
2019-06-10 14:43:55 +00:00
David Green d847aa573b [ARM] Enable Unroll UpperBound
This option allows loops with small max trip counts to be fully unrolled. This
can help with code like the remainder loops from manually unrolled loops like
those that appear in the cmsis dsp library. We would apparently previously
runtime unroll them with the default unroll count (4).

Differential Revision: https://reviews.llvm.org/D63064

llvm-svn: 362928
2019-06-10 10:22:14 +00:00
David Green c5471c2a57 [ARM] Adjust isLegalT1AddressImmediate for non-legal types
Types such as float and i64's do not have legal loads in Thumb1, but will still
be loaded with a LDR (or potentially multiple LDR's). As such we can treat the
cost of addressing mode calculations the same as an i32 and get some optimisation
benefits.

Differential Revision: https://reviews.llvm.org/D62968

llvm-svn: 362874
2019-06-08 10:32:53 +00:00
David Green 342d1b81a3 [ARM] Add MVE addressing to isLegalT2AddressImmediate
Now with MVE being added, we can add the vector addressing mode costs for it.
These are generally imm7 multiplied by the size of the type being loaded /
stored.

Differential Revision: https://reviews.llvm.org/D62967

llvm-svn: 362873
2019-06-08 10:18:23 +00:00
David Green 4ecce205d5 [ARM] Add fp16 addressing to isLegalT2AddressImmediate
The fp16 version of VLDR takes a imm8 multiplied by 2. This updates the costs
to account for those, and adds extra testing. It is dependant upon hasFPRegs16
as this is what the load/store instructions require.

Differential Revision: https://reviews.llvm.org/D62966

llvm-svn: 362872
2019-06-08 10:09:02 +00:00
David Green 10fbaa96c5 [ARM] Add HasNEON for all Neon patterns in ARMInstrNEON.td. NFCI
We are starting to add an entirely separate vector architecture to the ARM
backend. To do that we need at least some separation between the existing NEON
and the new MVE code. This patch just goes through the Neon patterns and
ensures that they are predicated on HasNEON, giving MVE a stable place to start
from.

No tests yet as this is largely an NFC, and we don't have the other target that
will treat any of these intructions as legal.

Differential Revision: https://reviews.llvm.org/D62945

llvm-svn: 362870
2019-06-08 09:36:49 +00:00
Mikhail Maltsev 08da01b496 [ARM] Add FP16 vector insert/extract patterns
This change adds two FP16 extraction and two insertion patterns
(one per possible vector length).
Extractions are handled by copying a Q/D register into one of VFP2
class registers, where single FP32 sub-registers can be accessed. Then
the extraction of even lanes are simple sub-register extractions
(because we don't care about the top parts of registers for FP16
operations). Odd lanes need an additional VMOVX instruction.

Unfortunately, insertions cannot be handled in the same way, because:
* There is no instruction to insert FP16 into an even lane (VINS only
  works with odd lanes)
* The patterns for odd lanes will have a form of a DAG (not a tree),
  and will not be implementable in pure tablegen

Because of this insertions are handled in the same way as 16-bit
integer insertions (with conversions between FP registers and GPRs
using VMOVHR instructions).

Without these patterns the ARM backend would sometimes fail during
instruction selection.

This patch also adds patterns which combine:
* an FP16 element extraction and a store into a single VST1
  instruction
* an FP16 load and insertion into a single VLD1 instruction

Differential Revision: https://reviews.llvm.org/D62651

llvm-svn: 362482
2019-06-04 09:39:55 +00:00
Simon Tatham ac02445524 [ARM] Turn some undefined encoding bits into 0s.
The family of 32-bit Thumb instruction encodings that include t2ORR,
t2AND and t2EOR are all listed in the ArmARM as having (0) in bit 15.
The Tablegen descriptions of those instructions listed them as ?. This
change tightens that up by making them into 0 + Unpredictable.

In the specific case of t2ORR, we tighten it up still further by
making the zero bit mandatory. This change comes from Arm v8.1-M, in
which encodings with that bit equal to 1 will now be used for
different instructions.


Reviewers: dmgreen, samparker, SjoerdMeijer, efriedma

Reviewed By: dmgreen, efriedma

Subscribers: efriedma, javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60705

llvm-svn: 362470
2019-06-04 08:28:48 +00:00
Diogo N. Sampaio df92f84110 [ARM][FIX] Ran out of registers due tail recursion
Summary:
- pr42062
When compiling for MinSize,
ARMTargetLowering::LowerCall decides to indirect
multiple calls to a same function. However,
it disconsiders the limitation that thumb1
indirect calls require the callee to be in a
register from r0 to r3 (llvm limiation).
If all those registers are used by arguments, the
compiler dies with "error: run out of registers
during register allocation".
This patch tells the function
IsEligibleForTailCallOptimization if we intend to
perform indirect calls, as to avoid tail call
optimization.

Reviewers: dmgreen, efriedma

Reviewed By: efriedma

Subscribers: javed.absar, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62683

llvm-svn: 362366
2019-06-03 08:58:05 +00:00
Sam Parker 913604a637 [NFC][ARM][ParallelDSP] Refactor narrow sequence
Most of the code used for finding a 'narrow' sequence is not used,
so I've removed it and simplified the calls from the smlad matcher.

llvm-svn: 362104
2019-05-30 15:26:37 +00:00
Sjoerd Meijer eb072b5a6a [ARM] Change the MC names for VMAXNM/VMINNM
Now the NEON ones have a prefix "NEON_", and the VFP ones have a
prefix "VFP_". This is so that the regex in ARMScheduleA57.td can be
made to match both of _those_ classes of VMAXNM without also matching
the MVE ones that are going to be introduced soon. NFCI.

Patch by Simon Tatham.

Differential Revision: https://reviews.llvm.org/D60700

llvm-svn: 362097
2019-05-30 14:34:29 +00:00
Simon Pilgrim 5359bb4d31 [ARM] LowerVECTOR_SHUFFLE - fix uninitialized variable warnings. NFCI.
llvm-svn: 362094
2019-05-30 14:01:24 +00:00
Sjoerd Meijer 930dee2c0b [ARM] add target arch definitions for 8.1-M and MVE
This adds:
- LLVM subtarget features to make all the new instructions conditional on,
- CPU and FPU names for use on clang's command line, with default FPUs set
  so that "armv8.1-m.main+fp" and "armv8.1-m.main+fp.dp" will select the right
  FPU features,
- architecture extension names "mve" and "mve.fp",
- ABI build attribute support for v8.1-M (a new value for Tag_CPU_arch) and MVE
  (a new actual tag).

Patch mostly by Simon Tatham.

Differential Revision: https://reviews.llvm.org/D60698

llvm-svn: 362090
2019-05-30 12:57:04 +00:00
Sjoerd Meijer 7eb95d672d [ARM] Introduce separate features for FP registers
The MVE extension in Arm v8.1-M permits the use of some move, load and
store isntructions which access the FP registers, even if there's no
actual FP support in the processor (in particular, if you have the
integer-only version of MVE).

Therefore, we need separate subtarget features to condition those
instructions on, which are implied by both FP and MVE but are not part
of either.

Patch mostly by Simon Tatham.

Differential Revision: https://reviews.llvm.org/D60694

llvm-svn: 362088
2019-05-30 12:37:05 +00:00
Sjoerd Meijer 5857bf5d1e [ARM] Add an MVE execution domain
MVE architecturally specifies a 'beat' system in which a vector
instruction executed now will complete its actual operation over the
next four cycles, so it can overlap with the execution of the previous
and next MVE instruction.

This makes it generally an advantage to avoid moving values back and
forth between MVE registers and anywhere else, if there's any sensible
way to do the same processing in whatever register type the values
already occupied.

That's just what the 'execution domain' system is supposed to achieve.
So here we add a new execution domain which will contain all the MVE
vector instructions when they are added.

Patch by: Simon Tatham

Differential Revision: https://reviews.llvm.org/D60703

llvm-svn: 362068
2019-05-30 08:07:06 +00:00
Sjoerd Meijer 24c5629625 [ARM] Split predicates out into their own .td file
The new ARMPredicates.td is included from ARM.td, early enough that
the predicate definitions are already in scope when ARMSchedule.td is
included. This will make it possible to refer to them in
UnsupportedFeatures fields of scheduling models.

NFC: the chunk of Tablegen being moved here is copied and pasted
verbatim.

Patch by: Simon Tatham

Differential Revision: https://reviews.llvm.org/D60693

llvm-svn: 361958
2019-05-29 13:41:57 +00:00
Simon Tatham 760df47b77 [ARM] Replace fp-only-sp and d16 with fp64 and d32.
Those two subtarget features were awkward because their semantics are
reversed: each one indicates the _lack_ of support for something in
the architecture, rather than the presence. As a consequence, you
don't get the behavior you want if you combine two sets of feature
bits.

Each SubtargetFeature for an FP architecture version now comes in four
versions, one for each combination of those options. So you can still
say (for example) '+vfp2' in a feature string and it will mean what
it's always meant, but there's a new string '+vfp2d16sp' meaning the
version without those extra options.

A lot of this change is just mechanically replacing positive checks
for the old features with negative checks for the new ones. But one
more interesting change is that I've rearranged getFPUFeatures() so
that the main FPU feature is appended to the output list *before*
rather than after the features derived from the Restriction field, so
that -fp64 and -d32 can override defaults added by the main feature.

Reviewers: dmgreen, samparker, SjoerdMeijer

Subscribers: srhines, javed.absar, eraman, kristof.beyls, hiraditya, zzheng, Petar.Avramovic, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D60691

llvm-svn: 361845
2019-05-28 16:13:20 +00:00
Diana Picus 68b20c589c [ARM GlobalISel] Cleanup CallLowering a bit
We never actually use the Offsets produced by ComputeValueVTs, so remove
them until we need them.

llvm-svn: 361755
2019-05-27 10:30:33 +00:00
Alexander Timofeev ba447bae74 [AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence.
Details: To make instruction selection really divergence driven it is necessary to assign
             the correct register classes to the cross block values beforehand. For the divergent targets
             same value type requires different register classes dependent on the value divergence.

    Reviewers: rampitec, nhaehnle

    Differential Revision: https://reviews.llvm.org/D59990

    This commit was reverted because of the build failure.
    The reason was mlformed patch.
    Build failure fixed.

llvm-svn: 361741
2019-05-26 20:33:26 +00:00
David Green 0dbafe191e [ARM] Select fp16 fma
This adds a pattern for fma, similar to the float and double patterns.

Differential Revision: https://reviews.llvm.org/D62330

llvm-svn: 361719
2019-05-26 11:34:30 +00:00
David Green 21542cd6f4 [ARM] Select a number of fp16 rounding functions
This add patterns for fp16 round and ceil etc. Same as the float and double
patterns.

Differential Revision: https://reviews.llvm.org/D62326

llvm-svn: 361718
2019-05-26 11:13:00 +00:00
David Green c9f4b7d201 [ARM] Promote various fp16 math intrinsics
Promote a number of fp16 math intrinsics to float, so that the relevant float
math routines can be used. Copysign is expanded so as to be handled in-place.

Differential Revision: https://reviews.llvm.org/D62325

llvm-svn: 361717
2019-05-26 10:59:21 +00:00
David Green 2881325b17 [ARM] Select fp16 fabs
This adds a pattern for the fabs intrinsic, the same as float and double.

Differential Revision: https://reviews.llvm.org/D62324

llvm-svn: 361715
2019-05-26 10:51:58 +00:00
David Green aeade651f3 [ARM] Select fp16 fsqrt
This adds a pattern for the sqrt intrinsic, the same as float and double.

Differential Revision: https://reviews.llvm.org/D62322

llvm-svn: 361714
2019-05-26 10:42:24 +00:00
David Green caf8a11b65 [ARM] Promote fp16 frem
Promote fp16 frem operations on ARM to floats so they call fmodf.

Differential Revision: https://reviews.llvm.org/D62321

llvm-svn: 361713
2019-05-26 10:30:22 +00:00
Peter Collingbourne 3b93737446 Revert r361644, "[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence."
Broke sanitizer bots:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/21694/steps/bootstrap%20clang/logs/stdio
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/32478/steps/check-llvm%20asan/logs/stdio

llvm-svn: 361688
2019-05-25 01:52:38 +00:00
Nick Desaulniers 9f7bd71cf5 [ARM] additionally check for ARM::INLINEASM_BR w/ ARM::INLINEASM
Summary:
We were observing failures for arm32 allyesconfigs of the Linux kernel
with the asm goto Clang patch, where ldr's were being generated to
offsets too far away to encode in imm12.

It looks like since INLINEASM_BR was created off of INLINEASM, a few
checks for INLINEASM needed to be updated to check for either case.

pr/41999

Link: https://github.com/ClangBuiltLinux/linux/issues/490

Reviewers: peter.smith, kristof.beyls, ostannard, rengolin, t.p.northover

Reviewed By: peter.smith

Subscribers: jyu2, javed.absar, hiraditya, llvm-commits, nathanchance, craig.topper, kees, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62400

llvm-svn: 361659
2019-05-24 18:58:21 +00:00
Alexander Timofeev dffedea014 [AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence.
Details: To make instruction selection really divergence driven it is necessary to assign
         the correct register classes to the cross block values beforehand. For the divergent targets
         same value type requires different register classes dependent on the value divergence.

Reviewers: rampitec, nhaehnle

Differential Revision: https://reviews.llvm.org/D59990

llvm-svn: 361644
2019-05-24 15:32:18 +00:00
Sjoerd Meijer 937af54666 [ARM] ARMExpandPseudoInsts: add debug messages
This pass wasn't printing any messages at all, which I find really inconvenient
while debugging/tracing things. It now dumps the before and after of expanded
instructions. It doesn't do this yet for all instructions, but this is a good
start I guess.

Differential Revision: https://reviews.llvm.org/D62297

llvm-svn: 361604
2019-05-24 08:25:02 +00:00
Sam Parker 617cdc5a6d [ARM][CGP] Clear SafeWrap before each search
The previous patch added a member set to store instructions that we
could allow to wrap. But this wasn't cleared between searches meaning
that they could get promoted, incorrectly, during the promotion of a
separate valid chain.

Differential Revision: https://reviews.llvm.org/D62254

llvm-svn: 361462
2019-05-23 07:46:39 +00:00
Sam Parker 3141bbd52d [ARM][CGP] Skip nuw in PrepareConstants
PrepareConstants step converts add/sub with 'negative' immediates to
sub/add with a 'positive' imm to make promotion more simple. nuw
already states that the add shouldn't cause an unsigned wrap, so
it shouldn't need any tweaking. Plus, we also don't allow a sub with
a 'negative' immediate to be safe wrap, so this functionality has
been removed. The PrepareConstants step now just handles the add
instructions that we've determined would be safe if they wrap around
zero.

Differential Revision: https://reviews.llvm.org/D62057

llvm-svn: 361227
2019-05-21 07:56:47 +00:00
Fangrui Song 43ca0e9eb8 [ARM] Support .reloc *, R_ARM_NONE, *
R_ARM_NONE can be used to create references among sections. When
--gc-sections is used, the referenced section will be retained if the
origin section is retained.

Add a generic MCFixupKind FK_NONE as this kind of no-op relocation is
ubiquitous on ELF and COFF, and probably available on many other binary
formats. See D62014.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D61992

llvm-svn: 360980
2019-05-17 02:51:54 +00:00
David Green 0582b22f10 [ARM] Don't use the Machine Scheduler for cortex-m at minsize
The new cortex-m schedule in rL360768 helps performance, but can increase the
amount of high-registers used. This, on average, ends up increasing the
codesize by a fair amount (because less instructions are converted from T2 to
T1). On cortex-m at -Oz, where we are quite size-paranoid, it is better to use
the existing DAG scheduler with the RegPressure scheduling preference (at least
until the issues around T2 vs T1 instructions can be improved).

I have also made sure that the Sched::RegPressure dag scheduler is always
chosen for MinSize.

The test shows one case where we increase the number of registers used.

Differential Revision: https://reviews.llvm.org/D61882

llvm-svn: 360769
2019-05-15 12:58:02 +00:00
David Green d2d0f46cd2 [ARM] Cortex-M4 schedule
This patch adds a simple Cortex-M4 schedule, renaming the existing M3
schedule to M4 and filling in the latencies as-per the Cortex-M4 TRM:
https://developer.arm.com/docs/ddi0439/latest

Most of these are 1, with the important exception being loads taking 2
cycles. A few others are also higher, but I don't believe they make a
large difference. I've repurposed the M3 schedule as the latencies are
mostly the same between the two cores, with the M4 having more FP and
DSP instructions. We also turn on MISched and UseAA for the cores that
now use this.

It also adds some schedule Write's to various instruction to make things
simpler.

Differential Revision: https://reviews.llvm.org/D54142

llvm-svn: 360768
2019-05-15 12:41:58 +00:00
Richard Trieu f3011b9b10 [ARM] Create a TargetInfo header. NFC
Move the declarations of getThe<Name>Target() functions into a new header in
TargetInfo and make users of these functions include this new header.
This fixes a layering problem.

llvm-svn: 360718
2019-05-14 22:29:50 +00:00
Sam Parker a33e311a3b [ARM][ParallelDSP] Relax alias checks
When deciding the safety of generating smlad, we checked for any
writes within the block that may alias with any of the loads that
need to be widened. This is overly conservative because it only
matters when there's a potential aliasing write to a location
accessed by a pair of loads.

Now we check for aliasing writes only once, during setup. If two
loads are found to have an aliasing write between them, we don't add
these loads to LoadPairs. This means that later during the transform,
we can safely widened a pair without worrying about aliasing.

However, to maintain correctness, we also need to change the way that
wide loads are inserted because the order is now important.

The MatchSMLAD method has also been changed, absorbing
MatchReductions and AddMACCandidate to hopefully improve readability.

Differential Revision: https://reviews.llvm.org/D6102

llvm-svn: 360567
2019-05-13 09:23:32 +00:00
Richard Trieu 5e3ee4b84e [ARM] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc.  Merging them together will fix this.  For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.

llvm-svn: 360490
2019-05-11 00:34:07 +00:00
Sam Parker d7b650cc72 [ARM][CGP] Guard against signext args and sitofp
Add an Argument that has the SExtAttr attached, as well as SIToFP
instructions, as values that generate sign bits. SIToFP doesn't
strictly do this and could be treated as a sink to be sign-extended.

Differential Revision: https://reviews.llvm.org/D61381

llvm-svn: 360331
2019-05-09 11:56:16 +00:00
Diana Picus 3531453371 [ARM GlobalISel] Map DBG_VALUE for types != s32
...and make sure we fail elegantly for unsupported values.

s64 goes into DPR, anything <= 32 into GPR.

llvm-svn: 360321
2019-05-09 09:49:36 +00:00
Tim Northover 18adcf331b ARM: disallow SP as Rn for Thumb2 TST & TEQ instructions
Using SP in this position is unpredictable in ARMv7. CMP and CMN are not
affected, and of course v8 relaxes this requirement, but that's handled
elsewhere.

llvm-svn: 360242
2019-05-08 10:59:08 +00:00
Diana Picus 0a47fb8884 [ARM GlobalISel] Widen G_SELECT operands
...except for the condition operand.

llvm-svn: 360135
2019-05-07 11:39:30 +00:00
Diana Picus d6d3808fa4 [ARM GlobalISel] Widen G_INTTOPTR/G_PTRTOINT
We actually have a couple of G_PTRTOINT to s8 when building clang, so
we should do something about them.

llvm-svn: 360130
2019-05-07 10:48:01 +00:00