Commit Graph

2143 Commits

Author SHA1 Message Date
Ahmed Bougacha f75782f9dc [GlobalISel][AArch64] Fold FI into LDR/STR ui addressing mode.
A majority of loads and stores at O0 access an alloca.

It's trivial to fold the G_FRAME_INDEX into the instruction; do it.

llvm-svn: 298864
2017-03-27 17:31:56 +00:00
Ahmed Bougacha 8a654085d0 [GlobalISel][AArch64] Fold G_GEP into LDR/STR ui addressing mode.
We're not to the point of supporting the load/store patterns yet
(because they extensively use PatFrags).

But in the meantime, we can implement some of the simplest addressing
modes.

llvm-svn: 298863
2017-03-27 17:31:52 +00:00
Ahmed Bougacha 85a66a6d9f [GlobalISel][AArch64] Select store of zero to WZR/XZR.
These occur very frequently, and are quite trivial to catch.

llvm-svn: 298862
2017-03-27 17:31:48 +00:00
Ahmed Bougacha 641cb203b6 [GlobalISel][AArch64] Select CBZ.
CBZ/CBNZ represent a substantial portion of all conditional branches.
Look through G_ICMP to select them.

We can't use tablegen yet because the existing patterns match an
AArch64ISD node.

llvm-svn: 298856
2017-03-27 16:35:31 +00:00
Chad Rosier 862a41270f [AArch64] Mark mrs of TPIDR_EL0 (thread pointer) as not having side effects.
Among other things, this allows Machine LICM to hoist a costly 'mrs'
instruction from within a loop.

Differential Revision: http://reviews.llvm.org/D31151

llvm-svn: 298851
2017-03-27 15:52:38 +00:00
Davide Italiano a2c4e4b929 [Target] Remove some code probably copy/pasted from another backend.
llvm-svn: 298825
2017-03-26 21:45:04 +00:00
Davide Italiano 5c2aa5d3e4 [MachineScheduler] Reference the correct header.
llvm-svn: 298823
2017-03-26 21:27:21 +00:00
Balaram Makam cf0e5e1c62 [AArch64] Refine Falkor Machine Model - Part1
llvm-svn: 298768
2017-03-25 04:02:39 +00:00
Jessica Paquette eac8633d6d [Outliner] Revert r298734.
When I tested r298734, I thought that red zones were enabled by default like in
X86. Since red zones are behind a flag on AArch64 the testing wasn't true.

llvm-svn: 298747
2017-03-24 23:00:21 +00:00
Jessica Paquette 167af85ec7 [Outliner] Remove no red zone requirment for AArch64
AArch64 doesn't require -mno-red-zone; stack fixups are sufficient here. This was
unnecessarily copied over from the X86 target.

(You can now outline with red zones! Yay!)

Removing the requirement passes all Single/MultiSource tests.

llvm-svn: 298734
2017-03-24 20:47:59 +00:00
Matt Arsenault 18bb24a1be TTI: Split IsSimple in MemIntrinsicInfo
All this did before was assert in EarlyCSE.

llvm-svn: 298724
2017-03-24 18:56:43 +00:00
Davide Italiano 0145e751c4 [AArch64] Drive-by cleanup, make this code shorter. NFCI.
llvm-svn: 298563
2017-03-22 23:37:58 +00:00
Reid Kleckner b518054b87 Rename AttributeSet to AttributeList
Summary:
This class is a list of AttributeSetNodes corresponding the function
prototype of a call or function declaration. This class used to be
called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is
typically accessed by parameter and return value index, so
"AttributeList" seems like a more intuitive name.

Rename AttributeSetImpl to AttributeListImpl to follow suit.

It's useful to rename this class so that we can rename AttributeSetNode
to AttributeSet later. AttributeSet is the set of attributes that apply
to a single function, argument, or return value.

Reviewers: sanjoy, javed.absar, chandlerc, pete

Reviewed By: pete

Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits

Differential Revision: https://reviews.llvm.org/D31102

llvm-svn: 298393
2017-03-21 16:57:19 +00:00
Jessica Paquette 02cbfb2926 [Outliner] ACTUALLY remove the errs output
I don't know how to type. This fixes the last commit which would have made all
of the overflows legal, and kept the screaming.

llvm-svn: 298263
2017-03-20 16:25:04 +00:00
Jessica Paquette 5d59a4ee19 [Outliner] Remove output for offset range check
Forgot to remove some output before committing last time. (Instruction fixups
don't actually overflow anywhere in the test suite so far, so I missed it).

To prevent the outliner from screaming "Overflow!" in the event that that
does happen, this commit removes that output.

llvm-svn: 298260
2017-03-20 15:51:45 +00:00
Diana Picus d79253a9f7 [GlobalISel] Use the correct calling conv for calls
This commit adds a parameter that lets us pass in the calling convention
of the call to CallLowering::lowerCall. This allows us to handle
situations where the calling convetion of the callee is different from
that of the caller.

Differential Revision: https://reviews.llvm.org/D31039

llvm-svn: 298254
2017-03-20 14:40:18 +00:00
Nirav Dave ac6081cb67 Make library calls sensitive to regparm module flag (Fixes PR3997).
Reviewers: mkuper, rnk

Subscribers: mehdi_amini, jyknight, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D27050

llvm-svn: 298179
2017-03-18 00:44:07 +00:00
Nirav Dave 6de2c77944 Capitalize ArgListEntry fields. NFC.
llvm-svn: 298178
2017-03-18 00:43:57 +00:00
Jessica Paquette ea8cc09be0 [Outliner] Add outliner for AArch64
This commit adds the necessary target hooks for outlining in AArch64. It also
refactors the switch statement used in `getMemOpBaseRegImmOfsWidth` into a
more general function, `getMemOpInfo`. This allows the outliner to share that
code without copying and pasting it.

The AArch64 outliner can be run using -mllvm -enable-machine-outliner, as with
the X86-64 outliner.

The test for this pass verifies that the outliner does, in fact outline
functions, fixes up the stack accesses properly, and can correctly generate a
tail call. In the future, this test should be replaced with a MIR test, so that
we can properly test immediate offset overflows in fixed-up instructions.

llvm-svn: 298162
2017-03-17 22:26:55 +00:00
Chad Rosier a69dcb6b66 [AArch64] Use alias analysis in the load/store optimization pass.
This allows the optimization to rearrange loads and stores more aggressively.

Differential Revision: http://reviews.llvm.org/D30903

llvm-svn: 298092
2017-03-17 14:19:55 +00:00
Reid Kleckner 45707d4d5a Remove getArgumentList() in favor of arg_begin(), args(), etc
Users often call getArgumentList().size(), which is a linear way to get
the number of function arguments. arg_size(), on the other hand, is
constant time.

In general, the fact that arguments are stored in an iplist is an
implementation detail, so I've removed it from the Function interface
and moved all other users to the argument container APIs (arg_begin(),
arg_end(), args(), arg_size()).

Reviewed By: chandlerc

Differential Revision: https://reviews.llvm.org/D31052

llvm-svn: 298010
2017-03-16 22:59:15 +00:00
Matthias Braun e959544517 TargetInstrInfo: Provide default implementation of isTailCall().
In fact this default implementation should be the only implementation,
keep it virtual for now to accomodate targets that don't model flags
correctly.

Differential Revision: https://reviews.llvm.org/D30747

llvm-svn: 297980
2017-03-16 20:02:30 +00:00
Daniel Sanders 0e64202871 [globalisel] Correct G_CONSTANT path of selectArithImmed()
Earlier stages of GlobalISel always use ConstantInt in G_CONSTANT so that's
what we should check for.

This fixes a crash introduced in r297782.

llvm-svn: 297968
2017-03-16 18:04:50 +00:00
Ahmed Bougacha 62cd73d989 [GlobalISel][AArch64] Select ADDXri.
We're now able to select ADDWri thanks to the new complex pattern
support.  Extend that to ADDXri.

llvm-svn: 297874
2017-03-15 19:20:59 +00:00
Daniel Sanders a228df75c0 [globalisel] LLVM_BUILD_GLOBAL_ISEL=OFF should prevent GlobalISel instruction selector from being declared.
llvm-svn: 297786
2017-03-14 22:09:29 +00:00
Daniel Sanders 8a4bae9993 [globalisel][tblgen] Add support for ComplexPatterns
Summary:
Adds a new kind of MachineOperand: MO_Placeholder.
This operand must not appear in the MIR and only exists as a way of
creating an 'uninitialized' operand until a matcher function overwrites it.

Depends on D30046, D29712

Reviewers: t.p.northover, ab, rovka, aditya_nandakumar, javed.absar, qcolombet

Reviewed By: qcolombet

Subscribers: dberris, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D30089

llvm-svn: 297782
2017-03-14 21:32:08 +00:00
Nirav Dave 54e22f33d9 In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Recommiting with compiler time improvements

    Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner.

    * Simplify Consecutive Merge Store Candidate Search

    Now that address aliasing is much less conservative, push through
    simplified store merging search and chain alias analysis which only
    checks for parallel stores through the chain subgraph. This is cleaner
    as the separation of non-interfering loads/stores from the
    store-merging logic.

    When merging stores search up the chain through a single load, and
    finds all possible stores by looking down from through a load and a
    TokenFactor to all stores visited.

    This improves the quality of the output SelectionDAG and the output
    Codegen (save perhaps for some ARM cases where we correctly constructs
    wider loads, but then promotes them to float operations which appear
    but requires more expensive constant generation).

    Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)

    Additional Minor Changes:

      1. Finishes removing unused AliasLoad code

      2. Unifies the chain aggregation in the merged stores across code
         paths

      3. Re-add the Store node to the worklist after calling
         SimplifyDemandedBits.

      4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
         arbitrary, but seems sufficient to not cause regressions in
         tests.

      5. Remove Chain dependencies of Memory operations on CopyfromReg
         nodes as these are captured by data dependence

      6. Forward loads-store values through tokenfactors containing
          {CopyToReg,CopyFromReg} Values.

      7. Peephole to convert buildvector of extract_vector_elt to
         extract_subvector if possible (see
         CodeGen/AArch64/store-merge.ll)

      8. Store merging for the ARM target is restricted to 32-bit as
         some in some contexts invalid 64-bit operations are being
         generated. This can be removed once appropriate checks are
         added.

    This finishes the change Matt Arsenault started in r246307 and
    jyknight's original patch.

    Many tests required some changes as memory operations are now
    reorderable, improving load-store forwarding. One test in
    particular is worth noting:

      CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
      forwarding converts a load-store pair into a parallel store and
      a memory-realized bitcast of the same value. However, because we
      lose the sharing of the explicit and implicit store values we
      must create another local store. A similar transformation
      happens before SelectionDAG as well.

    Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

llvm-svn: 297695
2017-03-14 00:34:14 +00:00
Balaram Makam cacc08bb46 [AArch64] Map Sched Read/Write resources for Falkor.
llvm-svn: 297611
2017-03-13 10:42:17 +00:00
Azharuddin Mohammed 473b75c3d5 Remove CRC32 instructions from AArch64InstrInfo::hasShiftedReg
Summary:
A53 scheduler causes an assertion failure on all CRC instructions:
include/llvm/CodeGen/MachineInstr.h:280: const llvm::MachineOperand
&llvm::MachineInstr::getOperand(unsigned int) const: Assertion `i <
getNumOperands() && "getOperand() out of range!"' failed.

The case statements corresponding to CRC instructions are incorrect and should
be removed.

Also adding a testcase while on this.

Reviewers: t.p.northover, javed.absar, apazos, rengolin

Reviewed By: rengolin

Subscribers: evandro, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D30274

llvm-svn: 297582
2017-03-12 14:02:32 +00:00
Evandro Menezes 8f70e249a7 [AArch64, X86] Additional debug information for MacroFusion
In order to make it easier to parse information about the performance of
MacroFusion, this patch adds the function and the instruction names to the
debug output of this pass.

llvm-svn: 297504
2017-03-10 20:20:04 +00:00
Daniel Sanders 52b4ce727a Recommit: [globalisel] Change LLT constructor string into an LLT-based object that knows how to generate it.
Summary:
This will allow future patches to inspect the details of the LLT. The implementation is now split between
the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns.

Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem.

The problem with the previous commit appears to have been that TableGen was including CodeGen/LowLevelType.h instead of Support/LowLevelTypeImpl.h.

Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar

Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D30046

llvm-svn: 297241
2017-03-07 23:20:35 +00:00
Joel Jones 2852088126 [AArch64] Vulcan is now ThunderXT99
Broadcom Vulcan is now Cavium ThunderX2T99.

LLVM Bugzilla: http://bugs.llvm.org/show_bug.cgi?id=32113

Minor fixes for the alignments of loops and functions for
ThunderX T81/T83/T88 (better performance).

Patch was tested with SpecCPU2006.

Patch by Stefan Teleman

Differential Revision: https://reviews.llvm.org/D30510

llvm-svn: 297190
2017-03-07 19:42:40 +00:00
Daniel Sanders 8ebec37d26 Revert r297177: Change LLT constructor string into an LLT-based object ...
More module problems. This time it only showed up in the stage 2 compile of
clang-x86_64-linux-selfhost-modules-2 but not the stage 1 compile.

Somehow, this change causes the build to need Attributes.gen before it's been
generated.

llvm-svn: 297188
2017-03-07 19:21:23 +00:00
Daniel Sanders 8612326a08 [globalisel] Change LLT constructor string into an LLT-based object that knows how to generate it.
Summary:
This will allow future patches to inspect the details of the LLT. The implementation is now split between
the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns.

Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem.

Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar

Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D30046

llvm-svn: 297177
2017-03-07 18:32:25 +00:00
Tim Northover c2c545b8f7 GlobalISel: restrict G_EXTRACT instruction to just one operand.
A bit more painful than G_INSERT because it was more widely used, but this
should simplify the handling of extract operations in most locations.

llvm-svn: 297100
2017-03-06 23:50:28 +00:00
Chad Rosier 9a70c7c02a [AArch64][Redundant Copy Elim] Add support for CMN and shifted imm.
This patch extends the current functionality of the AArch64 redundant copy
elimination pass to handle CMN instructions as well as a shifted
immediates.

Differential Revision: https://reviews.llvm.org/D30576.

llvm-svn: 297078
2017-03-06 21:20:00 +00:00
Tim Northover 3e6a7afd81 GlobalISel: constrain G_INSERT to inserting just one value per instruction.
It's much easier to reason about single-value inserts and no-one was actually
using the variadic variants before.

llvm-svn: 296923
2017-03-03 23:05:47 +00:00
Chandler Carruth ce52b80744 [SDAG] Revert r296476 (and r296486, r296668, r296690).
This patch causes compile times for some patterns to explode. I have
a (large, unreduced) test case that slows down by more than 20x and
several test cases slow down by 2x. I'm sending some of the test cases
directly to Nirav and following up with more details in the review log,
but this should unblock anyone else hitting this.

llvm-svn: 296862
2017-03-03 10:02:25 +00:00
Sjoerd Meijer 69bccf96bd [AArch64AsmParser] rewrite of function parseSysAlias
This is a cleanup/rewrite of the parseSysAlias function. It was not using the
tablegen instruction descriptions, but was “manually” matching the mnemonics
and recreating the operands whereas all this information is already in
tablegen; all this code has been replaced with calls to lookupXYZByName
tablegen calls.

Differential Revision: https://reviews.llvm.org/D30491

llvm-svn: 296857
2017-03-03 08:12:47 +00:00
Chad Rosier ea25eca04a [AArch64] Extend redundant copy elimination pass to handle non-zero stores.
This patch extends the current functionality of the AArch64 redundant copy
elimination pass to handle non-zero cases such as:

BB#0:
  cmp x0, #1
  b.eq .LBB0_1
.LBB0_1:
  orr x0, xzr, #0x1  ; <-- redundant copy; x0 known to hold #1.

Differential Revision: https://reviews.llvm.org/D29344

llvm-svn: 296809
2017-03-02 20:48:11 +00:00
Tim Northover e80d6d1360 GlobalISel: record correct stack usage for signext parameters.
The CallingConv.td rules allocate 8 bytes for these kinds of arguments
on AAPCS targets, but we were only recording the smaller amount. The
difference is theoretical on AArch64 because we don't actually store
more than the smaller amount, but it's still much better to have these
two components in agreement.

Based on Diana Picus's ARM equivalent patch (where it matters a lot
more).

llvm-svn: 296754
2017-03-02 15:34:18 +00:00
Matthew Simpson aee9771ae2 [ARM/AArch64] Update costs for interleaved accesses with wide types
After r296750, we're able to match interleaved accesses having types wider than
128 bits. This patch updates the associated TTI costs.

Differential Revision: https://reviews.llvm.org/D29675

llvm-svn: 296751
2017-03-02 15:15:35 +00:00
Matthew Simpson 1bfa159db9 [ARM/AArch64] Support wide interleaved accesses
This patch teaches (ARM|AArch64)ISelLowering.cpp to match illegal vector types
to interleaved access intrinsics as long as the types are multiples of the
vector register width. A "wide" access will now be mapped to multiple
interleave intrinsics similar to the way in which non-interleaved accesses with
illegal types are legalized into multiple accesses. I'll update the associated
TTI costs (in getInterleavedMemoryOpCost) as a follow-on.

Differential Revision: https://reviews.llvm.org/D29466

llvm-svn: 296750
2017-03-02 15:11:20 +00:00
Ahmed Bougacha 120ae22d70 [GlobalISel] Add a way for targets to enable GISel.
Until now, we've had to use -global-isel to enable GISel.  But using
that on other targets that don't support it will result in an abort, as we
can't build a full pipeline.
Additionally, we want to experiment with enabling GISel by default for
some targets: we can't just enable GISel by default, even among those
target that do have some support, because the level of support varies.

This first step adds an override for the target to explicitly define its
level of support.  For AArch64, do that using
a new command-line option (I know..):
  -aarch64-enable-global-isel-at-O=<N>
Where N is the opt-level below which GISel should be used.

Default that to -1, so that we still don't enable GISel anywhere.
We're not there yet!

While there, remove a couple LLVM_UNLIKELYs.  Building the pipeline is
such a cold path that in practice that shouldn't matter at all.

llvm-svn: 296710
2017-03-01 23:33:08 +00:00
Daniel Sanders 983c9b98e9 Revert r296474 - [globalisel] Change LLT constructor string into an LLT subclass that knows how to generate it.
There's a circular dependency that's only revealed when LLVM_ENABLE_MODULES=1.

llvm-svn: 296478
2017-02-28 15:00:27 +00:00
Nirav Dave f830dec3f2 In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Recommiting after fixup of 32-bit aliasing sign offset bug in DAGCombiner.

    * Simplify Consecutive Merge Store Candidate Search

    Now that address aliasing is much less conservative, push through
    simplified store merging search and chain alias analysis which only
    checks for parallel stores through the chain subgraph. This is cleaner
    as the separation of non-interfering loads/stores from the
    store-merging logic.

    When merging stores search up the chain through a single load, and
    finds all possible stores by looking down from through a load and a
    TokenFactor to all stores visited.

    This improves the quality of the output SelectionDAG and the output
    Codegen (save perhaps for some ARM cases where we correctly constructs
    wider loads, but then promotes them to float operations which appear
    but requires more expensive constant generation).

    Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)

    Additional Minor Changes:

      1. Finishes removing unused AliasLoad code

      2. Unifies the chain aggregation in the merged stores across code
         paths

      3. Re-add the Store node to the worklist after calling
         SimplifyDemandedBits.

      4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
         arbitrary, but seems sufficient to not cause regressions in
         tests.

      5. Remove Chain dependencies of Memory operations on CopyfromReg
         nodes as these are captured by data dependence

      6. Forward loads-store values through tokenfactors containing
          {CopyToReg,CopyFromReg} Values.

      7. Peephole to convert buildvector of extract_vector_elt to
         extract_subvector if possible (see
         CodeGen/AArch64/store-merge.ll)

      8. Store merging for the ARM target is restricted to 32-bit as
         some in some contexts invalid 64-bit operations are being
         generated. This can be removed once appropriate checks are
         added.

    This finishes the change Matt Arsenault started in r246307 and
    jyknight's original patch.

    Many tests required some changes as memory operations are now
    reorderable, improving load-store forwarding. One test in
    particular is worth noting:

      CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
      forwarding converts a load-store pair into a parallel store and
      a memory-realized bitcast of the same value. However, because we
      lose the sharing of the explicit and implicit store values we
      must create another local store. A similar transformation
      happens before SelectionDAG as well.

    Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

llvm-svn: 296476
2017-02-28 14:24:15 +00:00
Daniel Sanders a5afdefec6 [globalisel] Change LLT constructor string into an LLT subclass that knows how to generate it.
Summary:
This will allow future patches to inspect the details of the LLT. The implementation is now split between
the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns.

Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem.

Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar

Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D30046

llvm-svn: 296474
2017-02-28 14:21:31 +00:00
Sjoerd Meijer 32ecac7ac8 AArch64InstPrinter: rewrite of printSysAlias
This is a cleanup/rewrite of the printSysAlias function. This was not using the
tablegen instruction descriptions, but was "manually" decoding the
instructions. This has been replaced with calls to lookup_XYZ_ByEncoding
tablegen calls.

This revealed several problems. First, instruction IVAU had the wrong encoding.
This was cancelled out by the parser that incorrectly matched the wrong
encoding. Second, instruction CVAP was missing from the SystemOperands tablegen
descriptions, so this has been added. And third, the required target features
were not captured in the tablegen descriptions, so support for this has also
been added.

Differential Revision: https://reviews.llvm.org/D30329

llvm-svn: 296343
2017-02-27 14:45:34 +00:00
Sjoerd Meijer 6d171006f4 AArch64AsmParser: don't try to parse “[1]” for non-vector register operands
There are no instructions that have "[1]" as part of the assembly string;
FMOVXDhighr is out of date. This removes dead code.

Differential Revision: https://reviews.llvm.org/D30165

llvm-svn: 296327
2017-02-27 10:51:11 +00:00
Nirav Dave 73cd0194cf Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."
This reverts commit r296252 until 256-bit operations are more efficiently generated in X86.

llvm-svn: 296279
2017-02-26 01:27:32 +00:00