Commit Graph

2400 Commits

Author SHA1 Message Date
Krzysztof Parzyszek bea30c6286 Add "Restored" flag to CalleeSavedInfo
The liveness-tracking code assumes that the registers that were saved
in the function's prolog are live outside of the function. Specifically,
that registers that were saved are also live-on-exit from the function.
This isn't always the case as illustrated by the LR register on ARM.

Differential Revision: https://reviews.llvm.org/D36160

llvm-svn: 310619
2017-08-10 16:17:32 +00:00
Sam Parker 71a474d563 [AArch64] Assembler support for v8.3 RCpc
Added assembler and disassembler support for the new Release
Consistent processor consistent instructions, introduced with ARM
v8.3-A for AArch64.

Differential Revision: https://reviews.llvm.org/D36522

llvm-svn: 310575
2017-08-10 09:52:55 +00:00
Sam Parker 9d95764c3b [ARM][AArch64] ARMv8.3-A enablement
The beta ARMv8.3 ISA specifications have been released for AArch64
and AArch32, these can be found at:
https://developer.arm.com/products/architecture/a-profile/exploration-tools

An introduction to this architecture update can be found at:
https://community.arm.com/processors/b/blog/posts/armv8-a-architecture-2016-additions

This patch is the first in a series which will add ARM v8.3-A support
in LLVM and Clang. It adds the necessary changes that create targets
for both the ARM and AArch64 backends.

Differential Revision: https://reviews.llvm.org/D36514

llvm-svn: 310561
2017-08-10 09:41:00 +00:00
Sjoerd Meijer 7987633263 [AArch64] Assembler support for the ARMv8.2a dot product instructions
Dot product is an optional ARMv8.2a extension, see also the public architecture
specification here:
https://developer.arm.com/products/architecture/a-profile/exploration-tools.
This patch adds AArch64 assembler support for these dot product instructions.

Differential Revision: https://reviews.llvm.org/D36515

llvm-svn: 310480
2017-08-09 14:59:54 +00:00
Quentin Colombet 8dd90fb54b Revert "[GlobalISel] Remove the GISelAccessor API."
This reverts commit r310115.

It causes a linker failure for the one of the unittests of AArch64 on one
of the linux bot:
http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/3429

: && /home/fedora/gcc/install/gcc-7.1.0/bin/g++   -fPIC
-fvisibility-inlines-hidden -Werror=date-time -std=c++11 -Wall -W
-Wno-unused-parameter -Wwrite-strings -Wcast-qual
-Wno-missing-field-initializers -pedantic -Wno-long-long
-Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment
-ffunction-sections -fdata-sections -O2
-L/home/fedora/gcc/install/gcc-7.1.0/lib64 -Wl,-allow-shlib-undefined
-Wl,-O3 -Wl,--gc-sections
unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o  -o
unittests/Target/AArch64/AArch64Tests
lib/libLLVMAArch64CodeGen.so.6.0.0svn lib/libLLVMAArch64Desc.so.6.0.0svn
lib/libLLVMAArch64Info.so.6.0.0svn lib/libLLVMCodeGen.so.6.0.0svn
lib/libLLVMCore.so.6.0.0svn lib/libLLVMMC.so.6.0.0svn
lib/libLLVMMIRParser.so.6.0.0svn lib/libLLVMSelectionDAG.so.6.0.0svn
lib/libLLVMTarget.so.6.0.0svn lib/libLLVMSupport.so.6.0.0svn -lpthread
lib/libgtest_main.so.6.0.0svn lib/libgtest.so.6.0.0svn -lpthread
-Wl,-rpath,/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1/lib
&& :
unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x0):
undefined reference to `vtable for llvm::LegalizerInfo'
unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x8):
undefined reference to `vtable for llvm::RegisterBankInfo'

The particularity of this bot is that it is built with
BUILD_SHARED_LIBS=ON

However, I was not able to reproduce the problem so far.
Reverting to unblock the bot.

llvm-svn: 310425
2017-08-08 22:22:30 +00:00
Jessica Paquette d36945bf3a [MachineOutliner] Ensure AArch64 outliner doesn't mess with W30 or LR
Before, the outliner would mark all instructions that read from/modify LR as
illegal. This doesn't handle W30, which overlaps with LR. This shouldn't be
outlined.

This commit fixes that by making modifiesRegister() and readsRegister() look at
W30 + take in a TRI argument. This makes sure that modifiesRegister() and
readsRegister() won't outline either of W30 and LR.

https://reviews.llvm.org/D36435

llvm-svn: 310422
2017-08-08 21:51:26 +00:00
Daniel Sanders 0554004698 [globalisel][tablegen] Add support for importing 'imm' operands.
Summary:
This patch enables the import of rules containing 'imm' operands that do not
constrain the acceptable values using predicates. Support for ImmLeaf will
arrive in a later patch.

Depends on D35681

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: rovka

Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35833

llvm-svn: 310343
2017-08-08 10:44:31 +00:00
Joel Jones 60711ca253 [AArch64] LSE Atomics reorg - part 1
Add memory synchronization semantics to LSE Atomics.

The memory semantics feature will be added in a subsequent patch.

In this patch, several corrections were added to the existing LSE Atomics
implementation, based on the ARM Errata D11904 from 05/12/2017.

Patch by: steleman

Differential Revision: https://reviews.llvm.org/D35319

llvm-svn: 310167
2017-08-05 04:30:55 +00:00
Quentin Colombet c046208c52 [GlobalISel] Remove the GISelAccessor API.
Its sole purpose was to avoid spreading around ifdefs related to
building global-isel. Since r309990, GlobalISel is not optional anymore,
thus, we can get rid of this mechanism all together.

NFC.

llvm-svn: 310115
2017-08-04 20:15:46 +00:00
Craig Topper 4e22ee6745 [ConstantInt] Use ConstantInt::getValue instead of Constant::getUniqueInteger in a few places where we obviously have a ConstantInt. NFC
getUniqueInteger will ultimately call ConstantInt::getValue, but calling ConstantInt::getValue should be inlined.

llvm-svn: 310069
2017-08-04 16:59:29 +00:00
Chad Rosier 14fc82a1df [AArch64] Fix an assertion for pre-index generation with unscaled loads/stores.
Differential Revision: https://reviews.llvm.org/D36248
PR34035

llvm-svn: 310066
2017-08-04 16:44:06 +00:00
Quentin Colombet 250e050a50 [GlobalISel] Make GlobalISel a non-optional library.
With this change, the GlobalISel library gets always built. In
particular, this is not possible to opt GlobalISel out of the build
using the LLVM_BUILD_GLOBAL_ISEL variable any more.

llvm-svn: 309990
2017-08-03 21:52:25 +00:00
Tim Northover 869fa74d4b Revert "[AArch64] Simplify AES*Tied pseudo expansion (NFC)."
This reverts commit r309821.

My suggestion was wrong because it left the MachineOperands tied which
confused the verifier. Since there's no easy way to untie operands, the
original BuildMI solution is probably best.

llvm-svn: 309962
2017-08-03 16:59:36 +00:00
Rafael Espindola 79e238afee Delete Default and JITDefault code models
IMHO it is an antipattern to have a enum value that is Default.

At any given piece of code it is not clear if we have to handle
Default or if has already been mapped to a concrete value. In this
case in particular, only the target can do the mapping and it is nice
to make sure it is always done.

This deletes the two default enum values of CodeModel and uses an
explicit Optional<CodeModel> when it is possible that it is
unspecified.

llvm-svn: 309911
2017-08-03 02:16:21 +00:00
Florian Hahn 31f78fd0ae [AArch64] Simplify AES*Tied pseudo expansion (NFC).
Summary:
Suggested by @t.p.northover in https://bugs.llvm.org/show_bug.cgi?id=34015.


Reviewers: javed.absar, t.p.northover, rengolin

Reviewed By: t.p.northover

Subscribers: aemerson, kristof.beyls, llvm-commits, t.p.northover

Differential Revision: https://reviews.llvm.org/D36223

llvm-svn: 309821
2017-08-02 15:17:19 +00:00
Haicheng Wu 50692a203c [AArch64] Fix a typo in isExtFreeImpl()
next => not

Differential Revision: https://reviews.llvm.org/D36104

llvm-svn: 309748
2017-08-01 21:26:45 +00:00
Martin Storsjo eacf4e408b [AArch64] Rewrite stack frame handling for win64 vararg functions
The previous attempt, which made do with a single offset in
computeCalleeSaveRegisterPairs, wasn't quite enough. The previous
attempt only worked as long as CombineSPBump == true (since the
offset would be adjusted later in fixupCalleeSaveRestoreStackOffset).

Instead include the size for the fixed stack area used for win64
varargs in calculations in emitPrologue/emitEpilogue. The stack
consists of mainly three parts;
- AFI->getLocalStackSize()
- AFI->getCalleeSavedStackSize()
- FixedObject

Most of the places in the code which previously used the CSStackSize
now use PrologueSaveSize instead, which is the sum of the latter
two, while some cases which need exactly the middle one use
AFI->getCalleeSavedStackSize() explicitly instead of a local variable.

In addition to moving the offsetting into emitPrologue/emitEpilogue
(which fixes functions with CombineSPBump == false), also set the
frame pointer to point to the right location, where the frame pointer
and link register actually are stored. In addition to the prologue/epilogue,
this also requires changes to resolveFrameIndexReference.

Add tests for a function that keeps a frame pointer and another one
that uses a VLA.

Differential Revision: https://reviews.llvm.org/D35919

llvm-svn: 309744
2017-08-01 21:13:54 +00:00
Aditya Nandakumar 02c602e18c [GISel]: Support Widening G_ICMP's destination operand.
Updated AArch64 to widen destination to s32.
https://reviews.llvm.org/D35737

Reviewed by Tim

llvm-svn: 309579
2017-07-31 17:00:16 +00:00
Florian Hahn f63a5e91db [AArch64] Tie source and destination operands for AESMC/AESIMC.
Summary:
Most CPUs implementing AES fusion require instruction pairs of the form
    AESE Vn, _
    AESMC Vn, Vn
and
    AESD Vn, _
    AESIMC Vn, Vn

The constraint is added to AES(I)MC instructions which use the result of
an AES(E|D) instruction by using AES(I)MCTrr pseudo instructions, which
constraint source and destination registers to be the same.

A nice side effect of this change is that now all possible pairs are
scheduled back-to-back on the exynos-m1 for the misched-fusion-aes.ll
test case.

I had to update aes_load_store. The version I added initially was very
reduced and with the new constraint, AESE/AESMC could not be scheduled
back-to-back. I updated the test to be more realistic and still expose
the same scheduling problem as the initial test case.

Reviewers: t.p.northover, rengolin, evandro, kristof.beyls, silviu.baranga

Reviewed By: t.p.northover, evandro

Subscribers: aemerson, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D35299

llvm-svn: 309495
2017-07-29 20:35:28 +00:00
Florian Hahn 2f86e3d494 [AArch64] Use 8 bytes as preferred function alignment on Cortex-A53.
Summary:
This change gives a 0.25% speedup on execution time, a 0.82% improvement
in benchmark scores and a 0.20% increase in binary size on a Cortex-A53.
These numbers are the geomean results on a wide range of benchmarks from
the test-suite and a range of proprietary suites.

Reviewers: t.p.northover, aadg, silviu.baranga, mcrosier, rengolin

Reviewed By: rengolin

Subscribers: grimar, davide, aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35568

llvm-svn: 309494
2017-07-29 20:04:54 +00:00
Jessica Paquette d87f54493d [MachineOutliner] NFC: Change IsTailCall to a call class + frame class
This commit

- Removes IsTailCall and replaces it with a target-defined unsigned
- Refactors getOutliningCallOverhead and getOutliningFrameOverhead so that they don't use IsTailCall
- Adds a call class + frame class classification to OutlinedFunction and Candidate respectively

This accomplishes a couple things.

Firstly, we don't need the notion of *tail call* in the general outlining algorithm.

Secondly, we now can have different "outlining classes" for each candidate within a set of candidates.
This will make it easy to add new ways to outline sequences for certain targets and dynamically choose
an appropriate cost model for a sequence depending on the context that that sequence lives in.

Ultimately, this should get us closer to being able to do something like, say avoid saving the link
register when outlining AArch64 instructions.

llvm-svn: 309475
2017-07-29 02:55:46 +00:00
Tim Northover a7f583e33b GlobalISel: map 128-bit values to an FPR by default.
Eventually we may want to allow a pair of GPRs but absolutely nothing in the
entire world is ready for that yet.

llvm-svn: 309404
2017-07-28 17:11:01 +00:00
Joel Jones 08e88e8df7 [AArch64] Standardize suffixes for LSE Atomics mnemonics (NFCI)
This NFC changeset standardizes the suffixes used for LSE Atomics
instructions.

It changes the existing suffixes - 'b', 'h', 's', 'd' - to the existing
standard 'B', 'H', 'W' and 'X'.

This changeset is the result of the code review discussion for D35319.

Patch by: steleman

Differential Revision: https://reviews.llvm.org/D35927

llvm-svn: 309384
2017-07-28 14:09:24 +00:00
Jessica Paquette 809d708b8a [MachineOutliner] NFC: Split up getOutliningBenefit
This is some more cleanup in preparation for some actual
functional changes. This splits getOutliningBenefit into
two cost functions: getOutliningCallOverhead and
getOutliningFrameOverhead. These functions return the
number of instructions that would be required to call
a specific function and the number of instructions
that would be required to construct a frame for a
specific funtion. The actual outlining benefit logic
is moved into the outliner, which calls these functions.

The goal of refactoring getOutliningBenefit is to:

- Get us closer to getting rid of the IsTailCall flag

- Further split up "target-specific" things and
"general algorithm" things

llvm-svn: 309356
2017-07-28 03:21:58 +00:00
Adrian Prantl 8f4b353ee1 Remove unused function from AArch64 backend (NFC)
llvm-svn: 309336
2017-07-27 23:52:06 +00:00
Ahmed Bougacha 52cecb1f27 [AArch64] Remove outdated comment. NFC.
There hasn't been a ternary since r231987.

llvm-svn: 309324
2017-07-27 21:27:58 +00:00
Ahmed Bougacha 87807c5a86 [AArch64] Fix legality info passed to demanded bits for TBI opt.
The (seldom-used) TBI-aware optimization had a typo lying dormant since
it was first introduced, in r252573:  when asking for demanded bits, it
told TLI that it was running after legalize, where the opposite was
true.

This is an important piece of information, that the demanded bits
analysis uses to make assumptions about the node.  r301019 added such an
assumption, which was broken by the TBI combine.

Instead, pass the correct flags to TLO.

llvm-svn: 309323
2017-07-27 21:27:25 +00:00
Florian Hahn 67ddd1d08f [TargetParser] Use enum classes for various ARM kind enums.
Summary:
Using c++11 enum classes ensures that only valid enum values are used
for ArchKind, ProfileKind, VersionKind and ISAKind. This removes the
need for checks that the provided values map to a proper enum value,
allows us to get rid of AK_LAST and prevents comparing values from
different enums. It also removes a bunch of static_cast
from unsigned to enum values and vice versa, at the cost of introducing
static casts to access AArch64ARCHNames and ARMARCHNames by ArchKind.

FPUKind and ArchExtKind are the only remaining old-style enum in
TargetParser.h. I think it's beneficial to keep ArchExtKind as old-style
enum, but FPUKind can be converted too, but this patch is quite big, so
could do this in a follow-up patch. I could also split this patch up a
bit, if people would prefer that.

Reviewers: rengolin, javed.absar, chandlerc, rovka

Reviewed By: rovka

Subscribers: aemerson, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35882

llvm-svn: 309287
2017-07-27 16:27:56 +00:00
Evandro Menezes d192a8ae7d [AArch64] Adjust the cost model for Exynos M1 and M2
Add the information for the scalar reciprocal square root approximation.

llvm-svn: 309183
2017-07-26 21:28:15 +00:00
Peter Collingbourne 081ffe2ff2 Change CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. NFCI.
This was a use-after-free waiting to happen.

llvm-svn: 309159
2017-07-26 19:15:29 +00:00
Martin Storsjo 0b7bf7a2e3 [COFF, ARM64] Fix symbol offsets in ADRP/ADD/LDR/STR relocations
In COFF, a symbol offset can't be stored in the relocation (as is
done in ELF or MachO), but is stored as the immediate in the
instruction itself. The immediate in the ADRP thus is the symbol
offset in bytes, not in pages. For the PAGEOFFSET_12A/L relocations,
ignore any offset outside of the lowest 12 bits; they won't have any
effect on the ADD/LDR/STR instruction itself but only on the associated
ADRP.

This is similar to how the same issue is handled for MOVW/MOVT
instructions in ELF (see e.g. SVN r307713, and r307728 in lld).

This fixes "fixup out of range" errors while building larger object
files, where temporary symbols end up as a plain section symbol and
an offset, and fixes any cases where the symbol offset mean that
the actual target ended up on a different page than the symbol
itself.

Differential Revision: https://reviews.llvm.org/D35791

llvm-svn: 309105
2017-07-26 11:19:17 +00:00
Zvi Rackover 1b73682243 TargetLowering: Change isShuffleMaskLegal's mask argument type to ArrayRef<int>. NFCI.
Changing mask argument type from const SmallVectorImpl<int>& to
ArrayRef<int>.

This came up in D35700 where a mask is received as an ArrayRef<int> and
we want to pass it to TargetLowering::isShuffleMaskLegal().
Also saves a few lines of code.

llvm-svn: 309085
2017-07-26 08:06:58 +00:00
Eugene Zelenko 96d933da4f [AArch64] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 309062
2017-07-25 23:51:02 +00:00
Eric Christopher 97ae58686f Update the comments on default subtargets based on feedback.
llvm-svn: 309041
2017-07-25 22:21:08 +00:00
Martin Storsjo 8cb3667541 [AArch64] Reserve a 16 byte aligned amount of fixed stack for win64 varargs
Create a dummy 8 byte fixed object for the unused slot below the first
stored vararg.

Alternative ideas tested but skipped: One could try to align the whole
fixed object to 16, but I haven't found how to add an offset to the stack
frame used in LowerWin64_VASTART.

If only the size of the fixed stack object size is padded but not the offset, via
MFI.CreateFixedObject(alignTo(GPRSaveSize, 16), -(int)GPRSaveSize, false),
PrologEpilogInserter crashes due to "Attempted to reset backwards range!".

This fixes misconceptions about where registers are spilled, since
AArch64FrameLowering.cpp assumes the offset from fixed objects is
aligned to 16 bytes (and the Win64 case there already manually aligns
the offset to 16 bytes).

This fixes cases where local stack allocations could overwrite callee
saved registers on the stack.

Differential Revision: https://reviews.llvm.org/D35720

llvm-svn: 308950
2017-07-25 05:20:01 +00:00
Evandro Menezes 29ffb0e66a [AArch64] Adjust the cost model for Exynos M1 and M2
Fine tune the resources in a couple of ASIMD loads.

llvm-svn: 308904
2017-07-24 18:06:16 +00:00
Chad Rosier 9b2b4c961a [AArch64] Redundant Copy Elimination - remove more zero copies.
This patch removes unnecessary zero copies in BBs that are targets of b.eq/b.ne
and we know the result of the compare instruction is zero.  For example,

BB#0:
  subs w0, w1, w2
  str w0, [x1]
  b.ne .LBB0_2
BB#1:
  mov w0, wzr  ; <-- redundant
  str w0, [x2]
.LBB0_2

Differential Revision: https://reviews.llvm.org/D35075

llvm-svn: 308849
2017-07-23 16:38:08 +00:00
Jonas Paulsson 024e319489 [SystemZ, LoopStrengthReduce]
This patch makes LSR generate better code for SystemZ in the cases of memory
intrinsics, Load->Store pairs or comparison of immediate with memory.

In order to achieve this, the following common code changes were made:

 * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if
 LSR should do instruction-based addressing evaluations by calling
 isLegalAddressingMode() with the Instruction pointers.
 * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy
 as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address,
 not just loads or stores.

SystemZ changes:

 * isLSRCostLess() implemented with Insns first, and without ImmCost.
 * New function supportedAddressingMode() that is a helper for TTI methods
 looking at Instructions passed via pointers.

Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D35262
https://reviews.llvm.org/D35049

llvm-svn: 308729
2017-07-21 11:59:37 +00:00
Evandro Menezes 55459609c8 [AArch64] Adjust the cost model for Exynos M1 and M2
Add the cost for the EXT instructions and explicitly add the cost for a few
instructions that were implied by the coarse model.

llvm-svn: 308697
2017-07-20 23:41:50 +00:00
Tim Northover 7b6d66c0c9 Recommit: GlobalISel: select G_EXTRACT and G_INSERT instructions on AArch64.
It revealed a bug in the Localizer pass which has now been fixed.

This includes the fix for SUBREG_TO_REG committed separately last time.

llvm-svn: 308688
2017-07-20 22:58:38 +00:00
Mandeep Singh Grang d41ac895bb [COFF, ARM64, CodeView] Add support to emit CodeView debug info for ARM64 COFF
Reviewers: compnerd, ruiu, rnk, zturner

Reviewed By: rnk

Subscribers: majnemer, aemerson, aprantl, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35518

llvm-svn: 308665
2017-07-20 20:20:00 +00:00
Diana Picus 7534b28291 Revert "GlobalISel: select G_EXTRACT and G_INSERT instructions on AArch64."
This reverts commit 36c6a2ea9669bc3bb695928529a85d12d1d3e3f9 because it
broke the test-suite on the GlobalISel bot.

llvm-svn: 308603
2017-07-20 11:36:03 +00:00
Tim Northover 967d4aa7a0 GlobalISel: partially revert r308540.
An unfinished and untested implementation of ISel for G_UNMERGE_VALUES crept in
by mistake.

llvm-svn: 308542
2017-07-19 22:11:08 +00:00
Tim Northover 0e0b3c97dd GlobalISel: fix SUBREG_TO_REG implementation.
The first argument needs to be an immediate rather than a register. Should fix
some crashes in the verifier bot.

llvm-svn: 308540
2017-07-19 22:08:08 +00:00
Martin Storsjo b2e9fcfca4 [AArch64] Force relocations for all ADRP instructions
This generalizes an existing fix from ELF to MachO and COFF.

Test that an ADRP to a local symbol whose offset is known at assembly
time still produces relocations, both for MachO and COFF. Test that
an ADRP without a @page modifier on MachO fails (previously it
didn't).

Differential Revision: https://reviews.llvm.org/D35544

llvm-svn: 308518
2017-07-19 20:14:32 +00:00
Martin Storsjo 2ff5f5d681 [AArch64, COFF] Interpret .align as power of two for COFF as well
Differential Revision: https://reviews.llvm.org/D35545

llvm-svn: 308517
2017-07-19 20:14:24 +00:00
Tim Northover d59fbec8e2 GlobalISel: select G_EXTRACT and G_INSERT instructions on AArch64.
llvm-svn: 308493
2017-07-19 16:47:07 +00:00
Evandro Menezes e8411cba87 [AArch64] Adjust the feature set for Exynos M2
Add fusion of AES operations.

llvm-svn: 308388
2017-07-18 22:51:25 +00:00
Mandeep Singh Grang d857b4ca98 [COFF, ARM64] Reserve X18 register by default
Reviewers: compnerd, rnk, ruiu, mstorsjo

Reviewed By: mstorsjo

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35531

llvm-svn: 308358
2017-07-18 20:41:33 +00:00
Geoff Berry 9962faed2b [AArch64][Falkor] Avoid HW prefetcher tag collisions (step 2)
Summary:
Avoid HW prefetcher instruction tag collisions in loops by inserting
MOVs to change the base address register of strided loads.

Reviewers: t.p.northover, mcrosier

Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D35366

llvm-svn: 308324
2017-07-18 16:14:22 +00:00