Summary: This resolves the issue of tablegen-erated includes in the headers for non-GlobalISel builds in a simpler way than before.
Reviewers: qcolombet, ab
Reviewed By: ab
Subscribers: igorb, ab, mgorny, dberris, rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D30998
llvm-svn: 299637
A number of backends (AArch64, MIPS, ARM) have been using
MCContext::reportError to report issues such as out-of-range fixup values in
their TgtAsmBackend. This is great, but because MCContext couldn't easily be
threaded through to the adjustFixupValue helper function from its usual
callsite (applyFixup), these backends ended up adding an MCContext* argument
and adding another call to applyFixup to processFixupValue. Adding an
MCContext parameter to applyFixup makes this unnecessary, and even better -
applyFixup can take a reference to MCContext rather than a potentially null
pointer.
Differential Revision: https://reviews.llvm.org/D30264
llvm-svn: 299529
This improves upon r246462: that prevented FMOVs from being emitted
for the cross-class INSERT_SUBREGs by disabling the formation of
INSERT_SUBREGs of LOAD. But the ld1.s that we started selecting
caused us to introduce partial dependencies on the vector register.
Avoid that by using SCALAR_TO_VECTOR: it's a first-class citizen that
is folded away by many patterns, including the scalar LDRS that we
want in this case.
Credit goes to Adam for finding the issue!
llvm-svn: 299482
This mode is just like -mcmodel=small except that it moves the
thread pointer from TPIDR_EL0 to TPIDR_EL1.
Patch by Roland McGrath.
Differential Revision: https://reviews.llvm.org/D31624
llvm-svn: 299462
Summary:
Lift the restrictions that prevented the tree walking introduced in the
previous change and add support for patterns like:
(G_ADD (G_MUL (G_SEXT $src1), (G_SEXT $src2)), $src3) -> SMADDWrrr $dst, $src1, $src2, $src3
Also adds support for G_SEXT and G_ZEXT to support these cases.
One particular aspect of this that I should draw attention to is that I've
tried to be overly conservative in determining the safety of matches that
involve non-adjacent instructions and multiple basic blocks. This is intended
to be used as a cheap initial check and we may add a more expensive check in
the future. The current rules are:
* Reject if any instruction may load/store (we'd need to check for intervening
memory operations.
* Reject if any instruction has implicit operands.
* Reject if any instruction has unmodelled side-effects.
See isObviouslySafeToFold().
Reviewers: t.p.northover, javed.absar, qcolombet, aditya_nandakumar, ab, rovka
Reviewed By: ab
Subscribers: igorb, dberris, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D30539
llvm-svn: 299430
Summary:
Move the aarch64-type-promotion pass within the existing type promotion framework in CGP.
This change also support forking sexts when a new sext is required for promotion.
Note that change is based on D27853 and I am submitting this out early to provide a better idea on D27853.
Reviewers: jmolloy, mcrosier, javed.absar, qcolombet
Reviewed By: qcolombet
Subscribers: llvm-commits, aemerson, rengolin, mcrosier
Differential Revision: https://reviews.llvm.org/D28680
llvm-svn: 299379
Summary:
This feature enables folding of logical shift operations of up to 3 places into addressing mode on Kryo and Falkor that have a fastpath LSL.
Reviewers: mcrosier, rengolin, t.p.northover
Subscribers: junbuml, gberry, llvm-commits, aemerson
Differential Revision: https://reviews.llvm.org/D31113
llvm-svn: 299240
Follow up to D25691, this sets up the plumbing necessary to support vector demanded elements support in known bits calculations in target nodes.
Differential Revision: https://reviews.llvm.org/D31249
llvm-svn: 299201
Summary:
Similar to the ARM target in https://reviews.llvm.org/rL298380, this
patch adds identical infrastructure for disabling negative immediate
conversions, and converts the existing aliases to the new infrastucture.
Reviewers: rengolin, javed.absar, olista01, SjoerdMeijer, samparker
Reviewed By: samparker
Subscribers: samparker, aemerson, llvm-commits
Differential Revision: https://reviews.llvm.org/D31243
llvm-svn: 298908
We're not to the point of supporting the load/store patterns yet
(because they extensively use PatFrags).
But in the meantime, we can implement some of the simplest addressing
modes.
llvm-svn: 298863
CBZ/CBNZ represent a substantial portion of all conditional branches.
Look through G_ICMP to select them.
We can't use tablegen yet because the existing patterns match an
AArch64ISD node.
llvm-svn: 298856
Among other things, this allows Machine LICM to hoist a costly 'mrs'
instruction from within a loop.
Differential Revision: http://reviews.llvm.org/D31151
llvm-svn: 298851
When I tested r298734, I thought that red zones were enabled by default like in
X86. Since red zones are behind a flag on AArch64 the testing wasn't true.
llvm-svn: 298747
AArch64 doesn't require -mno-red-zone; stack fixups are sufficient here. This was
unnecessarily copied over from the X86 target.
(You can now outline with red zones! Yay!)
Removing the requirement passes all Single/MultiSource tests.
llvm-svn: 298734
Summary:
This class is a list of AttributeSetNodes corresponding the function
prototype of a call or function declaration. This class used to be
called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is
typically accessed by parameter and return value index, so
"AttributeList" seems like a more intuitive name.
Rename AttributeSetImpl to AttributeListImpl to follow suit.
It's useful to rename this class so that we can rename AttributeSetNode
to AttributeSet later. AttributeSet is the set of attributes that apply
to a single function, argument, or return value.
Reviewers: sanjoy, javed.absar, chandlerc, pete
Reviewed By: pete
Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits
Differential Revision: https://reviews.llvm.org/D31102
llvm-svn: 298393
Forgot to remove some output before committing last time. (Instruction fixups
don't actually overflow anywhere in the test suite so far, so I missed it).
To prevent the outliner from screaming "Overflow!" in the event that that
does happen, this commit removes that output.
llvm-svn: 298260
This commit adds a parameter that lets us pass in the calling convention
of the call to CallLowering::lowerCall. This allows us to handle
situations where the calling convetion of the callee is different from
that of the caller.
Differential Revision: https://reviews.llvm.org/D31039
llvm-svn: 298254
This commit adds the necessary target hooks for outlining in AArch64. It also
refactors the switch statement used in `getMemOpBaseRegImmOfsWidth` into a
more general function, `getMemOpInfo`. This allows the outliner to share that
code without copying and pasting it.
The AArch64 outliner can be run using -mllvm -enable-machine-outliner, as with
the X86-64 outliner.
The test for this pass verifies that the outliner does, in fact outline
functions, fixes up the stack accesses properly, and can correctly generate a
tail call. In the future, this test should be replaced with a MIR test, so that
we can properly test immediate offset overflows in fixed-up instructions.
llvm-svn: 298162
Users often call getArgumentList().size(), which is a linear way to get
the number of function arguments. arg_size(), on the other hand, is
constant time.
In general, the fact that arguments are stored in an iplist is an
implementation detail, so I've removed it from the Function interface
and moved all other users to the argument container APIs (arg_begin(),
arg_end(), args(), arg_size()).
Reviewed By: chandlerc
Differential Revision: https://reviews.llvm.org/D31052
llvm-svn: 298010
In fact this default implementation should be the only implementation,
keep it virtual for now to accomodate targets that don't model flags
correctly.
Differential Revision: https://reviews.llvm.org/D30747
llvm-svn: 297980
Earlier stages of GlobalISel always use ConstantInt in G_CONSTANT so that's
what we should check for.
This fixes a crash introduced in r297782.
llvm-svn: 297968
Summary:
Adds a new kind of MachineOperand: MO_Placeholder.
This operand must not appear in the MIR and only exists as a way of
creating an 'uninitialized' operand until a matcher function overwrites it.
Depends on D30046, D29712
Reviewers: t.p.northover, ab, rovka, aditya_nandakumar, javed.absar, qcolombet
Reviewed By: qcolombet
Subscribers: dberris, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D30089
llvm-svn: 297782
Recommiting with compiler time improvements
Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner.
* Simplify Consecutive Merge Store Candidate Search
Now that address aliasing is much less conservative, push through
simplified store merging search and chain alias analysis which only
checks for parallel stores through the chain subgraph. This is cleaner
as the separation of non-interfering loads/stores from the
store-merging logic.
When merging stores search up the chain through a single load, and
finds all possible stores by looking down from through a load and a
TokenFactor to all stores visited.
This improves the quality of the output SelectionDAG and the output
Codegen (save perhaps for some ARM cases where we correctly constructs
wider loads, but then promotes them to float operations which appear
but requires more expensive constant generation).
Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)
Additional Minor Changes:
1. Finishes removing unused AliasLoad code
2. Unifies the chain aggregation in the merged stores across code
paths
3. Re-add the Store node to the worklist after calling
SimplifyDemandedBits.
4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
arbitrary, but seems sufficient to not cause regressions in
tests.
5. Remove Chain dependencies of Memory operations on CopyfromReg
nodes as these are captured by data dependence
6. Forward loads-store values through tokenfactors containing
{CopyToReg,CopyFromReg} Values.
7. Peephole to convert buildvector of extract_vector_elt to
extract_subvector if possible (see
CodeGen/AArch64/store-merge.ll)
8. Store merging for the ARM target is restricted to 32-bit as
some in some contexts invalid 64-bit operations are being
generated. This can be removed once appropriate checks are
added.
This finishes the change Matt Arsenault started in r246307 and
jyknight's original patch.
Many tests required some changes as memory operations are now
reorderable, improving load-store forwarding. One test in
particular is worth noting:
CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
forwarding converts a load-store pair into a parallel store and
a memory-realized bitcast of the same value. However, because we
lose the sharing of the explicit and implicit store values we
must create another local store. A similar transformation
happens before SelectionDAG as well.
Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle
llvm-svn: 297695
Summary:
A53 scheduler causes an assertion failure on all CRC instructions:
include/llvm/CodeGen/MachineInstr.h:280: const llvm::MachineOperand
&llvm::MachineInstr::getOperand(unsigned int) const: Assertion `i <
getNumOperands() && "getOperand() out of range!"' failed.
The case statements corresponding to CRC instructions are incorrect and should
be removed.
Also adding a testcase while on this.
Reviewers: t.p.northover, javed.absar, apazos, rengolin
Reviewed By: rengolin
Subscribers: evandro, aemerson, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D30274
llvm-svn: 297582
In order to make it easier to parse information about the performance of
MacroFusion, this patch adds the function and the instruction names to the
debug output of this pass.
llvm-svn: 297504
Summary:
This will allow future patches to inspect the details of the LLT. The implementation is now split between
the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns.
Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem.
The problem with the previous commit appears to have been that TableGen was including CodeGen/LowLevelType.h instead of Support/LowLevelTypeImpl.h.
Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar
Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D30046
llvm-svn: 297241