2019-08-14 06:14:37 +08:00
|
|
|
// RUN: llvm-tblgen -gen-global-isel -I %p/../../include -I %p/Common -optimize-match-table=false %s -o %T/non-optimized.cpp
|
|
|
|
// RUN: llvm-tblgen -gen-global-isel -I %p/../../include -I %p/Common -optimize-match-table=true %s -o %T/optimized.cpp
|
|
|
|
// RUN: llvm-tblgen -gen-global-isel -I %p/../../include -I %p/Common %s -o %T/default.cpp
|
2018-05-22 07:28:51 +08:00
|
|
|
|
|
|
|
// RUN: FileCheck %s --check-prefixes=CHECK,R19C,R19N -input-file=%T/non-optimized.cpp
|
|
|
|
// RUN: FileCheck %s --check-prefixes=CHECK,R19C,R19O -input-file=%T/optimized.cpp
|
|
|
|
|
|
|
|
// RUN: FileCheck %s --check-prefixes=CHECK,R21C,R21N -input-file=%T/non-optimized.cpp
|
|
|
|
// RUN: FileCheck %s --check-prefixes=CHECK,R21C,R21O -input-file=%T/optimized.cpp
|
|
|
|
|
|
|
|
// RUN: FileCheck %s --check-prefixes=CHECK,R20C,R20N -input-file=%T/non-optimized.cpp
|
2018-05-23 10:04:19 +08:00
|
|
|
// RUN: FileCheck %s --check-prefixes=CHECK,R20C,R20O -input-file=%T/optimized.cpp
|
|
|
|
|
2018-05-22 07:28:51 +08:00
|
|
|
// RUN: FileCheck %s --check-prefixes=CHECK,R00C,R00N -input-file=%T/non-optimized.cpp
|
2018-05-23 10:04:19 +08:00
|
|
|
// RUN: FileCheck %s --check-prefixes=CHECK,R00C,R00O -input-file=%T/optimized.cpp
|
|
|
|
|
2018-05-22 07:28:51 +08:00
|
|
|
// RUN: FileCheck %s --check-prefixes=CHECK,R01C,R01N -input-file=%T/non-optimized.cpp
|
2018-05-23 10:04:19 +08:00
|
|
|
// RUN: FileCheck %s --check-prefixes=CHECK,R01C,R01O -input-file=%T/optimized.cpp
|
|
|
|
|
2018-05-22 07:28:51 +08:00
|
|
|
// RUN: FileCheck %s --check-prefixes=CHECK,R02C,R02N,NOOPT -input-file=%T/non-optimized.cpp
|
2018-05-23 10:04:19 +08:00
|
|
|
// RUN: FileCheck %s --check-prefixes=CHECK,R02C,R02O -input-file=%T/optimized.cpp
|
2018-05-22 07:28:51 +08:00
|
|
|
|
|
|
|
// RUN: diff %T/default.cpp %T/optimized.cpp
|
2017-02-04 08:47:05 +08:00
|
|
|
|
|
|
|
include "llvm/Target/Target.td"
|
2019-08-14 06:14:37 +08:00
|
|
|
include "GlobalISelEmitterCommon.td"
|
2017-02-04 08:47:05 +08:00
|
|
|
|
|
|
|
//===- Define the necessary boilerplate for our test target. --------------===//
|
|
|
|
|
2017-07-06 16:12:20 +08:00
|
|
|
let TargetPrefix = "mytarget" in {
|
|
|
|
def int_mytarget_nop : Intrinsic<[llvm_i32_ty], [llvm_i32_ty], [IntrNoMem]>;
|
|
|
|
}
|
|
|
|
|
[globalisel][tablegen] Fix patterns involving multiple ComplexPatterns.
Summary:
Temporaries are now allocated to operands instead of predicates and this
allocation is used to correctly pair up the rendered operands with the
matched operands.
Previously, ComplexPatterns were allocated temporaries independently in the
Src Pattern and Dst Pattern, leading to mismatches. Additionally, the Dst
Pattern failed to account for the allocated index and therefore always used
temporary 0, 1, ... when it should have used base+0, base+1, ...
Thanks to Aditya Nandakumar for noticing the bug.
Depends on D30539
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: rovka
Subscribers: igorb, dberris, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D31054
llvm-svn: 299538
2017-04-05 21:14:03 +08:00
|
|
|
def complex : Operand<i32>, ComplexPattern<i32, 2, "SelectComplexPattern", []> {
|
|
|
|
let MIOperandInfo = (ops i32imm, i32imm);
|
|
|
|
}
|
|
|
|
def gi_complex :
|
[globalisel][tablegen] Revise API for ComplexPattern operands to improve flexibility.
Summary:
Some targets need to be able to do more complex rendering than just adding an
operand or two to an instruction. For example, it may need to insert an
instruction to extract a subreg first, or it may need to perform an operation
on the operand.
In SelectionDAG, targets would create SDNode's to achieve the desired effect
during the complex pattern predicate. This worked because SelectionDAG had a
form of garbage collection that would take care of SDNode's that were created
but not used due to a later predicate rejecting a match. This doesn't translate
well to GlobalISel and the churn was wasteful.
The API changes in this patch enable GlobalISel to accomplish the same thing
without the waste. The API is now:
InstructionSelector::OptionalComplexRendererFn selectArithImmed(MachineOperand &Root) const;
where Root is the root of the match. The return value can be omitted to
indicate that the predicate failed to match, or a function with the signature
ComplexRendererFn can be returned. For example:
return OptionalComplexRendererFn(
[=](MachineInstrBuilder &MIB) { MIB.addImm(Immed).addImm(ShVal); });
adds two immediate operands to the rendered instruction. Immed and ShVal are
captured from the predicate function.
As an added bonus, this also reduces the amount of information we need to
provide to GIComplexOperandMatcher.
Depends on D31418
Reviewers: aditya_nandakumar, t.p.northover, qcolombet, rovka, ab, javed.absar
Reviewed By: ab
Subscribers: dberris, kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D31761
llvm-svn: 301079
2017-04-22 23:11:04 +08:00
|
|
|
GIComplexOperandMatcher<s32, "selectComplexPattern">,
|
[globalisel][tablegen] Fix patterns involving multiple ComplexPatterns.
Summary:
Temporaries are now allocated to operands instead of predicates and this
allocation is used to correctly pair up the rendered operands with the
matched operands.
Previously, ComplexPatterns were allocated temporaries independently in the
Src Pattern and Dst Pattern, leading to mismatches. Additionally, the Dst
Pattern failed to account for the allocated index and therefore always used
temporary 0, 1, ... when it should have used base+0, base+1, ...
Thanks to Aditya Nandakumar for noticing the bug.
Depends on D30539
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: rovka
Subscribers: igorb, dberris, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D31054
llvm-svn: 299538
2017-04-05 21:14:03 +08:00
|
|
|
GIComplexPatternEquiv<complex>;
|
2017-10-16 02:22:54 +08:00
|
|
|
def complex_rr : Operand<i32>, ComplexPattern<i32, 2, "SelectComplexPatternRR", []> {
|
|
|
|
let MIOperandInfo = (ops GPR32, GPR32);
|
|
|
|
}
|
|
|
|
def gi_complex_rr :
|
|
|
|
GIComplexOperandMatcher<s32, "selectComplexPatternRR">,
|
|
|
|
GIComplexPatternEquiv<complex_rr>;
|
[globalisel][tablegen] Fix patterns involving multiple ComplexPatterns.
Summary:
Temporaries are now allocated to operands instead of predicates and this
allocation is used to correctly pair up the rendered operands with the
matched operands.
Previously, ComplexPatterns were allocated temporaries independently in the
Src Pattern and Dst Pattern, leading to mismatches. Additionally, the Dst
Pattern failed to account for the allocated index and therefore always used
temporary 0, 1, ... when it should have used base+0, base+1, ...
Thanks to Aditya Nandakumar for noticing the bug.
Depends on D30539
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: rovka
Subscribers: igorb, dberris, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D31054
llvm-svn: 299538
2017-04-05 21:14:03 +08:00
|
|
|
|
2018-01-17 02:44:05 +08:00
|
|
|
def cimm8_xform : SDNodeXForm<imm, [{
|
|
|
|
uint64_t Val = N->getZExtValue() << 1;
|
|
|
|
return CurDAG->getTargetConstant(Val, SDLoc(N), MVT::i64);
|
|
|
|
}]>;
|
|
|
|
|
|
|
|
def cimm8 : Operand<i32>, ImmLeaf<i32, [{return isInt<8>(Imm);}], cimm8_xform>;
|
|
|
|
|
2020-08-27 23:15:33 +08:00
|
|
|
def gi_cimm8 : GICustomOperandRenderer<"renderImm">,
|
2018-01-17 02:44:05 +08:00
|
|
|
GISDNodeXFormEquiv<cimm8_xform>;
|
|
|
|
|
2020-08-27 23:15:33 +08:00
|
|
|
def gi_cimm9 : GICustomOperandRenderer<"renderImm">;
|
|
|
|
|
[globalisel][tablegen] Add experimental support for OperandWithDefaultOps, PredicateOperand, and OptionalDefOperand
Summary:
As far as instruction selection is concerned, all three appear to be same thing.
Support for these operands is experimental since AArch64 doesn't make use
of them and the in-tree targets that do use them (AMDGPU for
OperandWithDefaultOps, AMDGPU/ARM/Hexagon/Lanai for PredicateOperand, and ARM
for OperandWithDefaultOps) are not using tablegen-erated GlobalISel yet.
Reviewers: rovka, aditya_nandakumar, t.p.northover, qcolombet, ab
Reviewed By: rovka
Subscribers: inglorion, aemerson, rengolin, mehdi_amini, dberris, kristof.beyls, igorb, tpr, llvm-commits
Differential Revision: https://reviews.llvm.org/D31135
llvm-svn: 300037
2017-04-12 16:23:08 +08:00
|
|
|
def m1 : OperandWithDefaultOps <i32, (ops (i32 -1))>;
|
|
|
|
def Z : OperandWithDefaultOps <i32, (ops R0)>;
|
|
|
|
def m1Z : OperandWithDefaultOps <i32, (ops (i32 -1), R0)>;
|
|
|
|
|
[globalisel][tablegen] Import SelectionDAG's rule predicates and support the equivalent in GIRule.
Summary:
The SelectionDAG importer now imports rules with Predicate's attached via
Requires, PredicateControl, etc. These predicates are implemented as
bitset's to allow multiple predicates to be tested together. However,
unlike the MC layer subtarget features, each target only pays for it's own
predicates (e.g. AArch64 doesn't have 192 feature bits just because X86
needs a lot).
Both AArch64 and X86 derive at least one predicate from the MachineFunction
or Function so they must re-initialize AvailableFeatures before each
function. They also declare locals in <Target>InstructionSelector so that
computeAvailableFeatures() can use the code from SelectionDAG without
modification.
Reviewers: rovka, qcolombet, aditya_nandakumar, t.p.northover, ab
Reviewed By: rovka
Subscribers: aemerson, rengolin, dberris, kristof.beyls, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D31418
llvm-svn: 300993
2017-04-21 23:59:56 +08:00
|
|
|
def HasA : Predicate<"Subtarget->hasA()">;
|
|
|
|
def HasB : Predicate<"Subtarget->hasB()">;
|
2017-04-30 01:30:09 +08:00
|
|
|
def HasC : Predicate<"Subtarget->hasC()"> { let RecomputePerFunction = 1; }
|
[globalisel][tablegen] Import SelectionDAG's rule predicates and support the equivalent in GIRule.
Summary:
The SelectionDAG importer now imports rules with Predicate's attached via
Requires, PredicateControl, etc. These predicates are implemented as
bitset's to allow multiple predicates to be tested together. However,
unlike the MC layer subtarget features, each target only pays for it's own
predicates (e.g. AArch64 doesn't have 192 feature bits just because X86
needs a lot).
Both AArch64 and X86 derive at least one predicate from the MachineFunction
or Function so they must re-initialize AvailableFeatures before each
function. They also declare locals in <Target>InstructionSelector so that
computeAvailableFeatures() can use the code from SelectionDAG without
modification.
Reviewers: rovka, qcolombet, aditya_nandakumar, t.p.northover, ab
Reviewed By: rovka
Subscribers: aemerson, rengolin, dberris, kristof.beyls, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D31418
llvm-svn: 300993
2017-04-21 23:59:56 +08:00
|
|
|
|
|
|
|
//===- Test the function boilerplate. -------------------------------------===//
|
|
|
|
|
[globalisel][tablegen] Partially fix compile-time regressions by converting matcher to state-machine(s)
Summary:
Replace the matcher if-statements for each rule with a state-machine. This
significantly reduces compile time, memory allocations, and cumulative memory
allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is
recommitted.
The following patches will expand on this further to fully fix the regressions.
Reviewers: rovka, ab, t.p.northover, qcolombet, aditya_nandakumar
Reviewed By: ab
Subscribers: vitalybuka, aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D33758
llvm-svn: 307079
2017-07-04 22:35:06 +08:00
|
|
|
// CHECK: const unsigned MAX_SUBTARGET_PREDICATES = 3;
|
|
|
|
// CHECK: using PredicateBitset = llvm::PredicateBitsetImpl<MAX_SUBTARGET_PREDICATES>;
|
|
|
|
|
|
|
|
// CHECK-LABEL: #ifdef GET_GLOBALISEL_TEMPORARIES_DECL
|
|
|
|
// CHECK-NEXT: mutable MatcherState State;
|
2017-10-21 04:55:29 +08:00
|
|
|
// CHECK-NEXT: typedef ComplexRendererFns(MyTargetInstructionSelector::*ComplexMatcherMemFn)(MachineOperand &) const;
|
2020-11-06 22:19:59 +08:00
|
|
|
// CHECK-NEXT: typedef void(MyTargetInstructionSelector::*CustomRendererFn)(MachineInstrBuilder &, const MachineInstr &, int) const;
|
2018-01-17 02:44:05 +08:00
|
|
|
// CHECK-NEXT: const ISelInfoTy<PredicateBitset, ComplexMatcherMemFn, CustomRendererFn> ISelInfo;
|
2017-10-16 11:36:29 +08:00
|
|
|
// CHECK-NEXT: static MyTargetInstructionSelector::ComplexMatcherMemFn ComplexPredicateFns[];
|
2018-01-17 02:44:05 +08:00
|
|
|
// CHECK-NEXT: static MyTargetInstructionSelector::CustomRendererFn CustomRenderers[];
|
2017-12-20 22:41:51 +08:00
|
|
|
// CHECK-NEXT: bool testImmPredicate_I64(unsigned PredicateID, int64_t Imm) const override;
|
|
|
|
// CHECK-NEXT: bool testImmPredicate_APInt(unsigned PredicateID, const APInt &Imm) const override;
|
|
|
|
// CHECK-NEXT: bool testImmPredicate_APFloat(unsigned PredicateID, const APFloat &Imm) const override;
|
2018-05-03 04:07:15 +08:00
|
|
|
// CHECK-NEXT: const int64_t *getMatchTable() const override;
|
2020-09-14 16:39:25 +08:00
|
|
|
// CHECK-NEXT: bool testMIPredicate_MI(unsigned PredicateID, const MachineInstr &MI, const std::array<const MachineOperand *, 3> &Operands) const override;
|
[globalisel][tablegen] Partially fix compile-time regressions by converting matcher to state-machine(s)
Summary:
Replace the matcher if-statements for each rule with a state-machine. This
significantly reduces compile time, memory allocations, and cumulative memory
allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is
recommitted.
The following patches will expand on this further to fully fix the regressions.
Reviewers: rovka, ab, t.p.northover, qcolombet, aditya_nandakumar
Reviewed By: ab
Subscribers: vitalybuka, aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D33758
llvm-svn: 307079
2017-07-04 22:35:06 +08:00
|
|
|
// CHECK-NEXT: #endif // ifdef GET_GLOBALISEL_TEMPORARIES_DECL
|
|
|
|
|
|
|
|
// CHECK-LABEL: #ifdef GET_GLOBALISEL_TEMPORARIES_INIT
|
|
|
|
// CHECK-NEXT: , State(2),
|
2018-05-22 07:28:51 +08:00
|
|
|
// CHECK-NEXT: ISelInfo(TypeObjects, NumTypeObjects, FeatureBitsets, ComplexPredicateFns, CustomRenderers)
|
[globalisel][tablegen] Partially fix compile-time regressions by converting matcher to state-machine(s)
Summary:
Replace the matcher if-statements for each rule with a state-machine. This
significantly reduces compile time, memory allocations, and cumulative memory
allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is
recommitted.
The following patches will expand on this further to fully fix the regressions.
Reviewers: rovka, ab, t.p.northover, qcolombet, aditya_nandakumar
Reviewed By: ab
Subscribers: vitalybuka, aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D33758
llvm-svn: 307079
2017-07-04 22:35:06 +08:00
|
|
|
// CHECK-NEXT: #endif // ifdef GET_GLOBALISEL_TEMPORARIES_INIT
|
|
|
|
|
[globalisel][tablegen] Import SelectionDAG's rule predicates and support the equivalent in GIRule.
Summary:
The SelectionDAG importer now imports rules with Predicate's attached via
Requires, PredicateControl, etc. These predicates are implemented as
bitset's to allow multiple predicates to be tested together. However,
unlike the MC layer subtarget features, each target only pays for it's own
predicates (e.g. AArch64 doesn't have 192 feature bits just because X86
needs a lot).
Both AArch64 and X86 derive at least one predicate from the MachineFunction
or Function so they must re-initialize AvailableFeatures before each
function. They also declare locals in <Target>InstructionSelector so that
computeAvailableFeatures() can use the code from SelectionDAG without
modification.
Reviewers: rovka, qcolombet, aditya_nandakumar, t.p.northover, ab
Reviewed By: rovka
Subscribers: aemerson, rengolin, dberris, kristof.beyls, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D31418
llvm-svn: 300993
2017-04-21 23:59:56 +08:00
|
|
|
// CHECK-LABEL: enum SubtargetFeatureBits : uint8_t {
|
|
|
|
// CHECK-NEXT: Feature_HasABit = 0,
|
|
|
|
// CHECK-NEXT: Feature_HasBBit = 1,
|
2017-04-30 01:30:09 +08:00
|
|
|
// CHECK-NEXT: Feature_HasCBit = 2,
|
[globalisel][tablegen] Import SelectionDAG's rule predicates and support the equivalent in GIRule.
Summary:
The SelectionDAG importer now imports rules with Predicate's attached via
Requires, PredicateControl, etc. These predicates are implemented as
bitset's to allow multiple predicates to be tested together. However,
unlike the MC layer subtarget features, each target only pays for it's own
predicates (e.g. AArch64 doesn't have 192 feature bits just because X86
needs a lot).
Both AArch64 and X86 derive at least one predicate from the MachineFunction
or Function so they must re-initialize AvailableFeatures before each
function. They also declare locals in <Target>InstructionSelector so that
computeAvailableFeatures() can use the code from SelectionDAG without
modification.
Reviewers: rovka, qcolombet, aditya_nandakumar, t.p.northover, ab
Reviewed By: rovka
Subscribers: aemerson, rengolin, dberris, kristof.beyls, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D31418
llvm-svn: 300993
2017-04-21 23:59:56 +08:00
|
|
|
// CHECK-NEXT: };
|
|
|
|
|
|
|
|
// CHECK-LABEL: PredicateBitset MyTargetInstructionSelector::
|
2017-04-30 01:30:09 +08:00
|
|
|
// CHECK-NEXT: computeAvailableModuleFeatures(const MyTargetSubtarget *Subtarget) const {
|
[globalisel][tablegen] Import SelectionDAG's rule predicates and support the equivalent in GIRule.
Summary:
The SelectionDAG importer now imports rules with Predicate's attached via
Requires, PredicateControl, etc. These predicates are implemented as
bitset's to allow multiple predicates to be tested together. However,
unlike the MC layer subtarget features, each target only pays for it's own
predicates (e.g. AArch64 doesn't have 192 feature bits just because X86
needs a lot).
Both AArch64 and X86 derive at least one predicate from the MachineFunction
or Function so they must re-initialize AvailableFeatures before each
function. They also declare locals in <Target>InstructionSelector so that
computeAvailableFeatures() can use the code from SelectionDAG without
modification.
Reviewers: rovka, qcolombet, aditya_nandakumar, t.p.northover, ab
Reviewed By: rovka
Subscribers: aemerson, rengolin, dberris, kristof.beyls, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D31418
llvm-svn: 300993
2017-04-21 23:59:56 +08:00
|
|
|
// CHECK-NEXT: PredicateBitset Features;
|
|
|
|
// CHECK-NEXT: if (Subtarget->hasA())
|
2019-08-24 23:02:44 +08:00
|
|
|
// CHECK-NEXT: Features.set(Feature_HasABit);
|
[globalisel][tablegen] Import SelectionDAG's rule predicates and support the equivalent in GIRule.
Summary:
The SelectionDAG importer now imports rules with Predicate's attached via
Requires, PredicateControl, etc. These predicates are implemented as
bitset's to allow multiple predicates to be tested together. However,
unlike the MC layer subtarget features, each target only pays for it's own
predicates (e.g. AArch64 doesn't have 192 feature bits just because X86
needs a lot).
Both AArch64 and X86 derive at least one predicate from the MachineFunction
or Function so they must re-initialize AvailableFeatures before each
function. They also declare locals in <Target>InstructionSelector so that
computeAvailableFeatures() can use the code from SelectionDAG without
modification.
Reviewers: rovka, qcolombet, aditya_nandakumar, t.p.northover, ab
Reviewed By: rovka
Subscribers: aemerson, rengolin, dberris, kristof.beyls, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D31418
llvm-svn: 300993
2017-04-21 23:59:56 +08:00
|
|
|
// CHECK-NEXT: if (Subtarget->hasB())
|
2019-08-24 23:02:44 +08:00
|
|
|
// CHECK-NEXT: Features.set(Feature_HasBBit);
|
[globalisel][tablegen] Import SelectionDAG's rule predicates and support the equivalent in GIRule.
Summary:
The SelectionDAG importer now imports rules with Predicate's attached via
Requires, PredicateControl, etc. These predicates are implemented as
bitset's to allow multiple predicates to be tested together. However,
unlike the MC layer subtarget features, each target only pays for it's own
predicates (e.g. AArch64 doesn't have 192 feature bits just because X86
needs a lot).
Both AArch64 and X86 derive at least one predicate from the MachineFunction
or Function so they must re-initialize AvailableFeatures before each
function. They also declare locals in <Target>InstructionSelector so that
computeAvailableFeatures() can use the code from SelectionDAG without
modification.
Reviewers: rovka, qcolombet, aditya_nandakumar, t.p.northover, ab
Reviewed By: rovka
Subscribers: aemerson, rengolin, dberris, kristof.beyls, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D31418
llvm-svn: 300993
2017-04-21 23:59:56 +08:00
|
|
|
// CHECK-NEXT: return Features;
|
|
|
|
// CHECK-NEXT: }
|
2017-02-04 08:47:05 +08:00
|
|
|
|
2017-04-30 01:30:09 +08:00
|
|
|
// CHECK-LABEL: PredicateBitset MyTargetInstructionSelector::
|
|
|
|
// CHECK-NEXT: computeAvailableFunctionFeatures(const MyTargetSubtarget *Subtarget, const MachineFunction *MF) const {
|
|
|
|
// CHECK-NEXT: PredicateBitset Features;
|
|
|
|
// CHECK-NEXT: if (Subtarget->hasC())
|
2019-08-24 23:02:44 +08:00
|
|
|
// CHECK-NEXT: Features.set(Feature_HasCBit);
|
2017-04-30 01:30:09 +08:00
|
|
|
// CHECK-NEXT: return Features;
|
|
|
|
// CHECK-NEXT: }
|
|
|
|
|
2017-08-23 18:09:25 +08:00
|
|
|
// CHECK-LABEL: // LLT Objects.
|
|
|
|
// CHECK-NEXT: enum {
|
Add support for pointer types in patterns
Summary:
This adds support for defining patterns for global isel using pointer
types, for example:
def : Pat<(load GPR32:$src),
(p1 (LOAD GPR32:$src))>;
DAGISelEmitter will ignore the pointer information and treat these
types as integers with the same bit-width as the pointer type.
Reviewers: dsanders, rtereshin, arsenm
Reviewed By: arsenm
Subscribers: Petar.Avramovic, wdng, rovka, kristof.beyls, jfb, volkan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57065
llvm-svn: 354510
2019-02-21 03:43:47 +08:00
|
|
|
// CHECK-NEXT: GILLT_p0s32
|
2017-08-17 21:18:35 +08:00
|
|
|
// CHECK-NEXT: GILLT_s32,
|
|
|
|
// CHECK-NEXT: }
|
Add support for pointer types in patterns
Summary:
This adds support for defining patterns for global isel using pointer
types, for example:
def : Pat<(load GPR32:$src),
(p1 (LOAD GPR32:$src))>;
DAGISelEmitter will ignore the pointer information and treat these
types as integers with the same bit-width as the pointer type.
Reviewers: dsanders, rtereshin, arsenm
Reviewed By: arsenm
Subscribers: Petar.Avramovic, wdng, rovka, kristof.beyls, jfb, volkan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57065
llvm-svn: 354510
2019-02-21 03:43:47 +08:00
|
|
|
// CHECK-NEXT: const static size_t NumTypeObjects = 2;
|
2017-08-17 21:18:35 +08:00
|
|
|
// CHECK-NEXT: const static LLT TypeObjects[] = {
|
Add support for pointer types in patterns
Summary:
This adds support for defining patterns for global isel using pointer
types, for example:
def : Pat<(load GPR32:$src),
(p1 (LOAD GPR32:$src))>;
DAGISelEmitter will ignore the pointer information and treat these
types as integers with the same bit-width as the pointer type.
Reviewers: dsanders, rtereshin, arsenm
Reviewed By: arsenm
Subscribers: Petar.Avramovic, wdng, rovka, kristof.beyls, jfb, volkan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57065
llvm-svn: 354510
2019-02-21 03:43:47 +08:00
|
|
|
// CHECK-NEXT: LLT::pointer(0, 32),
|
2017-08-17 21:18:35 +08:00
|
|
|
// CHECK-NEXT: LLT::scalar(32),
|
|
|
|
// CHECK-NEXT: };
|
|
|
|
|
2017-08-23 18:09:25 +08:00
|
|
|
// CHECK-LABEL: // Feature bitsets.
|
|
|
|
// CHECK-NEXT: enum {
|
|
|
|
// CHECK-NEXT: GIFBS_Invalid,
|
|
|
|
// CHECK-NEXT: GIFBS_HasA,
|
|
|
|
// CHECK-NEXT: GIFBS_HasA_HasB_HasC,
|
|
|
|
// CHECK-NEXT: }
|
|
|
|
// CHECK-NEXT: const static PredicateBitset FeatureBitsets[] {
|
|
|
|
// CHECK-NEXT: {}, // GIFBS_Invalid
|
|
|
|
// CHECK-NEXT: {Feature_HasABit, },
|
|
|
|
// CHECK-NEXT: {Feature_HasABit, Feature_HasBBit, Feature_HasCBit, },
|
|
|
|
// CHECK-NEXT: };
|
|
|
|
|
|
|
|
// CHECK-LABEL: // ComplexPattern predicates.
|
|
|
|
// CHECK-NEXT: enum {
|
|
|
|
// CHECK-NEXT: GICP_Invalid,
|
|
|
|
// CHECK-NEXT: GICP_gi_complex,
|
2017-10-16 02:22:54 +08:00
|
|
|
// CHECK-NEXT: GICP_gi_complex_rr,
|
2017-08-23 18:09:25 +08:00
|
|
|
// CHECK-NEXT: };
|
|
|
|
|
2017-08-24 17:11:20 +08:00
|
|
|
// CHECK-LABEL: // PatFrag predicates.
|
|
|
|
// CHECK-NEXT: enum {
|
2018-01-17 02:44:05 +08:00
|
|
|
// CHECK-NEXT: GIPFP_I64_Predicate_cimm8 = GIPFP_I64_Invalid + 1,
|
|
|
|
// CHECK-NEXT: GIPFP_I64_Predicate_simm8,
|
2017-08-24 17:11:20 +08:00
|
|
|
// CHECK-NEXT: };
|
2018-01-17 02:44:05 +08:00
|
|
|
|
|
|
|
|
|
|
|
// CHECK-NEXT: bool MyTargetInstructionSelector::testImmPredicate_I64(unsigned PredicateID, int64_t Imm) const {
|
|
|
|
// CHECK-NEXT: switch (PredicateID) {
|
|
|
|
// CHECK-NEXT: case GIPFP_I64_Predicate_cimm8: {
|
|
|
|
// CHECK-NEXT: return isInt<8>(Imm);
|
|
|
|
// CHECK-NEXT: llvm_unreachable("ImmediateCode should have returned");
|
|
|
|
// CHECK-NEXT: return false;
|
|
|
|
// CHECK-NEXT: }
|
|
|
|
// CHECK-NEXT: case GIPFP_I64_Predicate_simm8: {
|
2017-12-20 22:41:51 +08:00
|
|
|
// CHECK-NEXT: return isInt<8>(Imm);
|
2018-01-17 02:44:05 +08:00
|
|
|
// CHECK-NEXT: llvm_unreachable("ImmediateCode should have returned");
|
|
|
|
// CHECK-NEXT: return false;
|
|
|
|
// CHECK-NEXT: }
|
|
|
|
// CHECK-NEXT: }
|
|
|
|
// CHECK-NEXT: llvm_unreachable("Unknown predicate");
|
|
|
|
// CHECK-NEXT: return false;
|
|
|
|
// CHECK-NEXT: }
|
2017-08-24 17:11:20 +08:00
|
|
|
|
[globalisel][tablegen] Add support for fpimm and import of APInt/APFloat based ImmLeaf.
Summary:
There's only a tablegen testcase for IntImmLeaf and not a CodeGen one
because the relevant rules are rejected for other reasons at the moment.
On AArch64, it's because there's an SDNodeXForm attached to the operand.
On X86, it's because the rule either emits multiple instructions or has
another predicate using PatFrag which cannot easily be supported at the
same time.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36569
llvm-svn: 315761
2017-10-14 05:28:03 +08:00
|
|
|
// CHECK-LABEL: // PatFrag predicates.
|
|
|
|
// CHECK-NEXT: enum {
|
|
|
|
// CHECK-NEXT: GIPFP_APFloat_Predicate_fpimmz = GIPFP_APFloat_Invalid + 1,
|
|
|
|
// CHECK-NEXT: };
|
2017-12-20 22:41:51 +08:00
|
|
|
// CHECK-NEXT: bool MyTargetInstructionSelector::testImmPredicate_APFloat(unsigned PredicateID, const APFloat & Imm) const {
|
|
|
|
// CHECK-NEXT: switch (PredicateID) {
|
|
|
|
// CHECK-NEXT: case GIPFP_APFloat_Predicate_fpimmz: {
|
|
|
|
// CHECK-NEXT: return Imm->isExactlyValue(0.0);
|
|
|
|
// CHECK-NEXT: llvm_unreachable("ImmediateCode should have returned");
|
|
|
|
// CHECK-NEXT: return false;
|
|
|
|
// CHECK-NEXT: }
|
|
|
|
// CHECK-NEXT: }
|
|
|
|
// CHECK-NEXT: llvm_unreachable("Unknown predicate");
|
|
|
|
// CHECK-NEXT: return false;
|
|
|
|
// CHECK-NEXT: }
|
[globalisel][tablegen] Add support for fpimm and import of APInt/APFloat based ImmLeaf.
Summary:
There's only a tablegen testcase for IntImmLeaf and not a CodeGen one
because the relevant rules are rejected for other reasons at the moment.
On AArch64, it's because there's an SDNodeXForm attached to the operand.
On X86, it's because the rule either emits multiple instructions or has
another predicate using PatFrag which cannot easily be supported at the
same time.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36569
llvm-svn: 315761
2017-10-14 05:28:03 +08:00
|
|
|
|
|
|
|
// CHECK-LABEL: // PatFrag predicates.
|
|
|
|
// CHECK-NEXT: enum {
|
|
|
|
// CHECK-NEXT: GIPFP_APInt_Predicate_simm9 = GIPFP_APInt_Invalid + 1,
|
|
|
|
// CHECK-NEXT: };
|
2017-12-20 22:41:51 +08:00
|
|
|
// CHECK-NEXT: bool MyTargetInstructionSelector::testImmPredicate_APInt(unsigned PredicateID, const APInt & Imm) const {
|
|
|
|
// CHECK-NEXT: switch (PredicateID) {
|
|
|
|
// CHECK-NEXT: case GIPFP_APInt_Predicate_simm9: {
|
|
|
|
// CHECK-NEXT: return isInt<9>(Imm->getSExtValue());
|
|
|
|
// CHECK-NEXT: llvm_unreachable("ImmediateCode should have returned");
|
|
|
|
// CHECK-NEXT: return false;
|
|
|
|
// CHECK-NEXT: }
|
|
|
|
// CHECK-NEXT: }
|
|
|
|
// CHECK-NEXT: llvm_unreachable("Unknown predicate");
|
|
|
|
// CHECK-NEXT: return false;
|
|
|
|
// CHECK-NEXT: }
|
[globalisel][tablegen] Add support for fpimm and import of APInt/APFloat based ImmLeaf.
Summary:
There's only a tablegen testcase for IntImmLeaf and not a CodeGen one
because the relevant rules are rejected for other reasons at the moment.
On AArch64, it's because there's an SDNodeXForm attached to the operand.
On X86, it's because the rule either emits multiple instructions or has
another predicate using PatFrag which cannot easily be supported at the
same time.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36569
llvm-svn: 315761
2017-10-14 05:28:03 +08:00
|
|
|
|
2017-10-16 11:36:29 +08:00
|
|
|
// CHECK-LABEL: MyTargetInstructionSelector::ComplexMatcherMemFn
|
|
|
|
// CHECK-NEXT: MyTargetInstructionSelector::ComplexPredicateFns[] = {
|
|
|
|
// CHECK-NEXT: nullptr, // GICP_Invalid
|
|
|
|
// CHECK-NEXT: &MyTargetInstructionSelector::selectComplexPattern, // gi_complex
|
|
|
|
// CHECK-NEXT: &MyTargetInstructionSelector::selectComplexPatternRR, // gi_complex_rr
|
|
|
|
// CHECK-NEXT: }
|
|
|
|
|
2018-01-17 02:44:05 +08:00
|
|
|
// CHECK-LABEL: // Custom renderers.
|
|
|
|
// CHECK-NEXT: enum {
|
|
|
|
// CHECK-NEXT: GICR_Invalid,
|
2020-08-27 23:15:33 +08:00
|
|
|
// CHECK-NEXT: GICR_renderImm,
|
2018-01-17 02:44:05 +08:00
|
|
|
// CHECK-NEXT: };
|
|
|
|
// CHECK-NEXT: MyTargetInstructionSelector::CustomRendererFn
|
|
|
|
// CHECK-NEXT: MyTargetInstructionSelector::CustomRenderers[] = {
|
2020-01-09 07:57:44 +08:00
|
|
|
// CHECK-NEXT: nullptr, // GICR_Invalid
|
2020-08-27 23:15:33 +08:00
|
|
|
// CHECK-NEXT: &MyTargetInstructionSelector::renderImm,
|
2018-01-17 02:44:05 +08:00
|
|
|
// CHECK-NEXT: };
|
|
|
|
|
[globalisel][tablegen] Generate rule coverage and use it to identify untested rules
Summary:
This patch adds a LLVM_ENABLE_GISEL_COV which, like LLVM_ENABLE_DAGISEL_COV,
causes TableGen to instrument the generated table to collect rule coverage
information. However, LLVM_ENABLE_GISEL_COV goes a bit further than
LLVM_ENABLE_DAGISEL_COV. The information is written to files
(${CMAKE_BINARY_DIR}/gisel-coverage-* by default). These files can then be
concatenated into ${LLVM_GISEL_COV_PREFIX}-all after which TableGen will
read this information and use it to emit warnings about untested rules.
This technique could also be used by SelectionDAG and can be further
extended to detect hot rules and give them priority over colder rules.
Usage:
* Enable LLVM_ENABLE_GISEL_COV in CMake
* Build the compiler and run some tests
* cat gisel-coverage-[0-9]* > gisel-coverage-all
* Delete lib/Target/*/*GenGlobalISel.inc*
* Build the compiler
Known issues:
* ${LLVM_GISEL_COV_PREFIX}-all must be generated as a manual
step due to a lack of a portable 'cat' command. It should be the
concatenation of all ${LLVM_GISEL_COV_PREFIX}-[0-9]* files.
* There's no mechanism to discard coverage information when the ruleset
changes
Depends on D39742
Reviewers: ab, qcolombet, t.p.northover, aditya_nandakumar, rovka
Reviewed By: rovka
Subscribers: vsk, arsenm, nhaehnle, mgorny, kristof.beyls, javed.absar, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D39747
llvm-svn: 318356
2017-11-16 08:46:35 +08:00
|
|
|
// CHECK: bool MyTargetInstructionSelector::selectImpl(MachineInstr &I, CodeGenCoverage &CoverageInfo) const {
|
2017-07-05 22:50:18 +08:00
|
|
|
// CHECK-NEXT: MachineFunction &MF = *I.getParent()->getParent();
|
|
|
|
// CHECK-NEXT: MachineRegisterInfo &MRI = MF.getRegInfo();
|
|
|
|
// CHECK-NEXT: const PredicateBitset AvailableFeatures = getAvailableFeatures();
|
|
|
|
// CHECK-NEXT: NewMIVector OutMIs;
|
|
|
|
// CHECK-NEXT: State.MIs.clear();
|
|
|
|
// CHECK-NEXT: State.MIs.push_back(&I);
|
2017-02-04 08:47:05 +08:00
|
|
|
|
2018-05-03 04:07:15 +08:00
|
|
|
// CHECK: if (executeMatchTable(*this, OutMIs, State, ISelInfo, getMatchTable(), TII, MRI, TRI, RBI, AvailableFeatures, CoverageInfo)) {
|
|
|
|
// CHECK-NEXT: return true;
|
|
|
|
// CHECK-NEXT: }
|
|
|
|
|
2018-05-22 06:21:24 +08:00
|
|
|
// CHECK: const int64_t *
|
|
|
|
// CHECK-LABEL: MyTargetInstructionSelector::getMatchTable() const {
|
2018-05-22 07:28:51 +08:00
|
|
|
// CHECK-NEXT: MatchTable0[] = {
|
|
|
|
|
|
|
|
//===- Test a pattern with multiple ComplexPatterns in multiple instrs ----===//
|
|
|
|
//
|
[GlobalISel][InstructionSelect] Switching MatchTable over opcodes, perf patch 4
This patch continues a series of patches started by r332907 (reapplied
as r332917)
In this commit we introduce a new matching opcode GIM_SwitchOpcode
that implements a jump table over opcodes and start emitting them for
root instructions.
This is expected to decrease time GlobalISel spends in its
InstructionSelect pass by roughly 20% for an -O0 build as measured on
sqlite3-amalgamation (http://sqlite.org/download.html) targeting
AArch64.
To some degree, we assume here that the opcodes form a dense set,
which is true at the moment for all upstream targets given the
limitations of our rule importing mechanism.
It might not be true for out of tree targets, specifically due to
pseudo's. If so, we might noticeably increase the size of the
MatchTable with this patch due to padding zeros. This will be
addressed later.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333017
2018-05-23 03:37:59 +08:00
|
|
|
// R19O-NEXT: GIM_SwitchOpcode, /*MI*/0, /*[*/{{[0-9]+}}, {{[0-9]+}}, /*)*//*default:*//*Label [[DEFAULT_NUM:[0-9]+]]*/ [[DEFAULT:[0-9]+]],
|
|
|
|
// R19O-NEXT: /*TargetOpcode::G_ADD*//*Label [[CASE_ADD_NUM:[0-9]+]]*/ [[CASE_ADD:[0-9]+]],
|
|
|
|
// R19O: /*TargetOpcode::G_SELECT*//*Label [[CASE_SELECT_NUM:[0-9]+]]*/ [[CASE_SELECT:[0-9]+]],
|
|
|
|
// R19O: // Label [[CASE_ADD_NUM]]: @[[CASE_ADD]]
|
|
|
|
// R19O: // Label [[CASE_SELECT_NUM]]: @[[CASE_SELECT]]
|
2018-05-23 10:04:19 +08:00
|
|
|
// R19O-NEXT: GIM_Try, /*On fail goto*//*Label [[GROUP_NUM:[0-9]+]]*/ [[GROUP:[0-9]+]],
|
|
|
|
// R19O-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
[GlobalISel][InstructionSelect] Maximizing # of Group's common conditions, perf patch 8
This patch continues a series of patches started by r332907 (reapplied
as r332917).
In this commit we greedily stuff 2nd level GroupMatcher's common
conditions with as many predicates as possible. This is purely
post-processing and it doesn't change which rules are put into the
groups in the first place: that decision is made by looking at the
first common predicate only.
The compile time improvements are minor and well within error margin,
however, it's highly improbable that this transformation could
pessimize performance, thus I'm still committing it for potential
gains for targets not implementing GlobalISel yet and out of tree
targets.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333139
2018-05-24 06:50:53 +08:00
|
|
|
// R19O-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
|
|
|
// R19O-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
|
|
|
// R19O-NEXT: GIM_CheckType, /*MI*/0, /*Op*/3, /*Type*/GILLT_s32,
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// R19C-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
2018-05-23 10:04:19 +08:00
|
|
|
//
|
2018-06-16 07:13:43 +08:00
|
|
|
// R19O-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// R19O-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
2018-05-22 12:31:50 +08:00
|
|
|
// R19N-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/4,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R19N-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_SELECT,
|
|
|
|
// R19N-NEXT: // MIs[0] dst
|
2018-05-23 10:04:19 +08:00
|
|
|
// R19N-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R19N-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// R19N-NEXT: // MIs[0] src1
|
2018-05-24 03:16:59 +08:00
|
|
|
// R19N-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R19N-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
2020-09-14 17:37:14 +08:00
|
|
|
// R19N-NEXT: // MIs[0] complex_rr:src2a:src2b
|
2018-05-24 03:16:59 +08:00
|
|
|
// R19N-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
2018-05-23 10:04:19 +08:00
|
|
|
//
|
2018-05-22 07:28:51 +08:00
|
|
|
// R19N-NEXT: GIM_CheckComplexPattern, /*MI*/0, /*Op*/2, /*Renderer*/0, GICP_gi_complex_rr,
|
2018-05-24 03:16:59 +08:00
|
|
|
// R19N-NEXT: // MIs[0] Operand 3
|
|
|
|
// R19N-NEXT: GIM_CheckType, /*MI*/0, /*Op*/3, /*Type*/GILLT_s32,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R19C-NEXT: GIM_RecordInsn, /*DefineMI*/1, /*MI*/0, /*OpIdx*/3, // MIs[1]
|
2018-05-22 12:31:50 +08:00
|
|
|
// R19N-NEXT: GIM_CheckNumOperands, /*MI*/1, /*Expected*/4,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R19C-NEXT: GIM_CheckOpcode, /*MI*/1, TargetOpcode::G_SELECT,
|
2018-05-22 12:31:50 +08:00
|
|
|
// R19N-NEXT: // MIs[1] Operand 0
|
|
|
|
// R19N-NEXT: GIM_CheckType, /*MI*/1, /*Op*/0, /*Type*/GILLT_s32,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R19N-NEXT: // MIs[1] src3
|
|
|
|
// R19C-NEXT: GIM_CheckType, /*MI*/1, /*Op*/1, /*Type*/GILLT_s32,
|
2018-05-24 03:16:59 +08:00
|
|
|
// R19O-NEXT: GIM_CheckType, /*MI*/1, /*Op*/2, /*Type*/GILLT_s32,
|
|
|
|
// R19O-NEXT: GIM_CheckType, /*MI*/1, /*Op*/3, /*Type*/GILLT_s32,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R19N-NEXT: GIM_CheckRegBankForClass, /*MI*/1, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// R19N-NEXT: // MIs[1] src4
|
2018-05-24 03:16:59 +08:00
|
|
|
// R19N-NEXT: GIM_CheckType, /*MI*/1, /*Op*/2, /*Type*/GILLT_s32,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R19N-NEXT: GIM_CheckComplexPattern, /*MI*/1, /*Op*/2, /*Renderer*/1, GICP_gi_complex,
|
2020-09-14 17:37:14 +08:00
|
|
|
// R19N-NEXT: // MIs[1] complex:src5a:src5b
|
2018-05-24 03:16:59 +08:00
|
|
|
// R19N-NEXT: GIM_CheckType, /*MI*/1, /*Op*/3, /*Type*/GILLT_s32,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R19N-NEXT: GIM_CheckComplexPattern, /*MI*/1, /*Op*/3, /*Renderer*/2, GICP_gi_complex,
|
2018-05-24 07:58:10 +08:00
|
|
|
// R19O-NEXT: GIM_CheckRegBankForClass, /*MI*/1, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R19C-NEXT: GIM_CheckIsSafeToFold, /*InsnID*/1,
|
|
|
|
// R19O-NEXT: GIM_CheckComplexPattern, /*MI*/0, /*Op*/2, /*Renderer*/0, GICP_gi_complex_rr,
|
|
|
|
// R19O-NEXT: GIM_CheckComplexPattern, /*MI*/1, /*Op*/2, /*Renderer*/1, GICP_gi_complex,
|
|
|
|
// R19O-NEXT: GIM_CheckComplexPattern, /*MI*/1, /*Op*/3, /*Renderer*/2, GICP_gi_complex,
|
|
|
|
// R19C-NEXT: // (select:{ *:[i32] } GPR32:{ *:[i32] }:$src1, (complex_rr:{ *:[i32] } GPR32:{ *:[i32] }:$src2a, GPR32:{ *:[i32] }:$src2b), (select:{ *:[i32] } GPR32:{ *:[i32] }:$src3, complex:{ *:[i32] }:$src4, (complex:{ *:[i32] } i32imm:{ *:[i32] }:$src5a, i32imm:{ *:[i32] }:$src5b))) => (INSN3:{ *:[i32] } GPR32:{ *:[i32] }:$src1, GPR32:{ *:[i32] }:$src2b, GPR32:{ *:[i32] }:$src2a, (INSN4:{ *:[i32] } GPR32:{ *:[i32] }:$src3, complex:{ *:[i32] }:$src4, i32imm:{ *:[i32] }:$src5a, i32imm:{ *:[i32] }:$src5b))
|
|
|
|
// R19C-NEXT: GIR_MakeTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s32,
|
|
|
|
// R19C-NEXT: GIR_BuildMI, /*InsnID*/1, /*Opcode*/MyTarget::INSN4,
|
|
|
|
// R19C-NEXT: GIR_AddTempRegister, /*InsnID*/1, /*TempRegID*/0, /*TempRegFlags*/RegState::Define,
|
|
|
|
// R19C-NEXT: GIR_Copy, /*NewInsnID*/1, /*OldInsnID*/1, /*OpIdx*/1, // src3
|
|
|
|
// R19C-NEXT: GIR_ComplexRenderer, /*InsnID*/1, /*RendererID*/1,
|
|
|
|
// R19C-NEXT: GIR_ComplexSubOperandRenderer, /*InsnID*/1, /*RendererID*/2, /*SubOperand*/0, // src5a
|
|
|
|
// R19C-NEXT: GIR_ComplexSubOperandRenderer, /*InsnID*/1, /*RendererID*/2, /*SubOperand*/1, // src5b
|
|
|
|
// R19C-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/1,
|
|
|
|
// R19C-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::INSN3,
|
|
|
|
// R19C-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
|
|
|
// R19C-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/1, // src1
|
|
|
|
// R19C-NEXT: GIR_ComplexSubOperandRenderer, /*InsnID*/0, /*RendererID*/0, /*SubOperand*/1, // src2b
|
|
|
|
// R19C-NEXT: GIR_ComplexSubOperandRenderer, /*InsnID*/0, /*RendererID*/0, /*SubOperand*/0, // src2a
|
|
|
|
// R19C-NEXT: GIR_AddTempRegister, /*InsnID*/0, /*TempRegID*/0, /*TempRegFlags*/0,
|
|
|
|
// R19C-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// R19C-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// R19C-NEXT: // GIR_Coverage, 19,
|
|
|
|
// R19C-NEXT: GIR_Done,
|
|
|
|
// R19C-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
|
|
|
//
|
2018-05-23 10:04:19 +08:00
|
|
|
// R19O: // Label [[GROUP_NUM]]: @[[GROUP]]
|
|
|
|
// R19O-NEXT: GIM_Reject,
|
[GlobalISel][InstructionSelect] Switching MatchTable over opcodes, perf patch 4
This patch continues a series of patches started by r332907 (reapplied
as r332917)
In this commit we introduce a new matching opcode GIM_SwitchOpcode
that implements a jump table over opcodes and start emitting them for
root instructions.
This is expected to decrease time GlobalISel spends in its
InstructionSelect pass by roughly 20% for an -O0 build as measured on
sqlite3-amalgamation (http://sqlite.org/download.html) targeting
AArch64.
To some degree, we assume here that the opcodes form a dense set,
which is true at the moment for all upstream targets given the
limitations of our rule importing mechanism.
It might not be true for out of tree targets, specifically due to
pseudo's. If so, we might noticeably increase the size of the
MatchTable with this patch due to padding zeros. This will be
addressed later.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333017
2018-05-23 03:37:59 +08:00
|
|
|
// R19O: // Label [[DEFAULT_NUM]]: @[[DEFAULT]]
|
|
|
|
// R19O-NEXT: GIM_Reject,
|
|
|
|
// R19O-NEXT: };
|
2017-07-11 18:40:18 +08:00
|
|
|
|
2018-02-17 06:37:15 +08:00
|
|
|
def INSN3 : I<(outs GPR32:$dst),
|
|
|
|
(ins GPR32Op:$src1, GPR32:$src2a, GPR32:$src2b, GPR32:$scr), []>;
|
|
|
|
def INSN4 : I<(outs GPR32:$scr),
|
|
|
|
(ins GPR32:$src3, complex:$src4, i32imm:$src5a, i32imm:$src5b), []>;
|
|
|
|
def : Pat<(select GPR32:$src1, (complex_rr GPR32:$src2a, GPR32:$src2b),
|
|
|
|
(select GPR32:$src3,
|
|
|
|
complex:$src4,
|
|
|
|
(complex i32imm:$src5a, i32imm:$src5b))),
|
|
|
|
(INSN3 GPR32:$src1, GPR32:$src2b, GPR32:$src2a,
|
|
|
|
(INSN4 GPR32:$src3, complex:$src4, i32imm:$src5a,
|
|
|
|
i32imm:$src5b))>;
|
2017-02-04 08:47:05 +08:00
|
|
|
|
[GlobalISel][InstructionSelect] Switching MatchTable over opcodes, perf patch 4
This patch continues a series of patches started by r332907 (reapplied
as r332917)
In this commit we introduce a new matching opcode GIM_SwitchOpcode
that implements a jump table over opcodes and start emitting them for
root instructions.
This is expected to decrease time GlobalISel spends in its
InstructionSelect pass by roughly 20% for an -O0 build as measured on
sqlite3-amalgamation (http://sqlite.org/download.html) targeting
AArch64.
To some degree, we assume here that the opcodes form a dense set,
which is true at the moment for all upstream targets given the
limitations of our rule importing mechanism.
It might not be true for out of tree targets, specifically due to
pseudo's. If so, we might noticeably increase the size of the
MatchTable with this patch due to padding zeros. This will be
addressed later.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333017
2018-05-23 03:37:59 +08:00
|
|
|
// R21O-NEXT: GIM_SwitchOpcode, /*MI*/0, /*[*/{{[0-9]+}}, {{[0-9]+}}, /*)*//*default:*//*Label [[DEFAULT_NUM:[0-9]+]]*/ [[DEFAULT:[0-9]+]],
|
|
|
|
// R21O-NEXT: /*TargetOpcode::G_ADD*//*Label [[CASE_ADD_NUM:[0-9]+]]*/ [[CASE_ADD:[0-9]+]],
|
|
|
|
// R21O: /*TargetOpcode::G_SELECT*//*Label [[CASE_SELECT_NUM:[0-9]+]]*/ [[CASE_SELECT:[0-9]+]],
|
|
|
|
// R21O: // Label [[CASE_ADD_NUM]]: @[[CASE_ADD]]
|
|
|
|
// R21O: // Label [[CASE_SELECT_NUM]]: @[[CASE_SELECT]]
|
2018-05-23 10:04:19 +08:00
|
|
|
// R21O-NEXT: GIM_Try, /*On fail goto*//*Label [[GROUP_NUM:[0-9]+]]*/ [[GROUP:[0-9]+]],
|
|
|
|
// R21O-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
[GlobalISel][InstructionSelect] Maximizing # of Group's common conditions, perf patch 8
This patch continues a series of patches started by r332907 (reapplied
as r332917).
In this commit we greedily stuff 2nd level GroupMatcher's common
conditions with as many predicates as possible. This is purely
post-processing and it doesn't change which rules are put into the
groups in the first place: that decision is made by looking at the
first common predicate only.
The compile time improvements are minor and well within error margin,
however, it's highly improbable that this transformation could
pessimize performance, thus I'm still committing it for potential
gains for targets not implementing GlobalISel yet and out of tree
targets.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333139
2018-05-24 06:50:53 +08:00
|
|
|
// R21O-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
|
|
|
// R21O-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
|
|
|
// R21O-NEXT: GIM_CheckType, /*MI*/0, /*Op*/3, /*Type*/GILLT_s32,
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// R21C-NEXT: GIM_Try, /*On fail goto*//*Label [[PREV_NUM:[0-9]+]]*/ [[PREV:[0-9]+]], // Rule ID 19 //
|
|
|
|
// R21C-NOT: GIR_Done,
|
|
|
|
// R21C: // GIR_Coverage, 19,
|
|
|
|
// R21C-NEXT: GIR_Done,
|
|
|
|
// R21C-NEXT: // Label [[PREV_NUM]]: @[[PREV]]
|
|
|
|
// R21C-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]], // Rule ID 21 //
|
|
|
|
//
|
2018-06-16 07:13:43 +08:00
|
|
|
// R21O-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// R21O-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
2018-05-22 12:31:50 +08:00
|
|
|
// R21N-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/4,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R21N-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_SELECT,
|
|
|
|
// R21N-NEXT: // MIs[0] dst
|
2018-05-23 10:04:19 +08:00
|
|
|
// R21N-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R21N-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// R21N-NEXT: // MIs[0] src1
|
2018-05-24 03:16:59 +08:00
|
|
|
// R21N-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R21N-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// R21N-NEXT: // MIs[0] src2
|
2018-05-24 03:16:59 +08:00
|
|
|
// R21N-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
2018-05-23 10:04:19 +08:00
|
|
|
//
|
2020-06-21 09:38:43 +08:00
|
|
|
// R21O-NEXT: GIM_CheckCxxInsnPredicate, /*MI*/0, /*FnId*/GIPFP_MI_Predicate_frag,
|
2018-05-24 03:16:59 +08:00
|
|
|
// R21C-NEXT: GIM_CheckComplexPattern, /*MI*/0, /*Op*/2, /*Renderer*/0, GICP_gi_complex,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R21N-NEXT: // MIs[0] src3
|
2018-05-24 03:16:59 +08:00
|
|
|
// R21N-NEXT: GIM_CheckType, /*MI*/0, /*Op*/3, /*Type*/GILLT_s32,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R21C-NEXT: GIM_CheckComplexPattern, /*MI*/0, /*Op*/3, /*Renderer*/1, GICP_gi_complex,
|
2020-06-21 09:38:43 +08:00
|
|
|
// R21N-NEXT: GIM_CheckCxxInsnPredicate, /*MI*/0, /*FnId*/GIPFP_MI_Predicate_frag,
|
2018-06-16 07:13:43 +08:00
|
|
|
// R21C-NEXT: // (select:{ *:[i32] } GPR32:{ *:[i32] }:$src1, complex:{ *:[i32] }:$src2, complex:{ *:[i32] }:$src3)<<P:Predicate_frag>> => (INSN2:{ *:[i32] } GPR32:{ *:[i32] }:$src1, complex:{ *:[i32] }:$src3, complex:{ *:[i32] }:$src2)
|
|
|
|
|
2018-05-22 07:28:51 +08:00
|
|
|
// R21C-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::INSN2,
|
|
|
|
// R21C-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
|
|
|
// R21C-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/1, // src1
|
|
|
|
// R21C-NEXT: GIR_ComplexRenderer, /*InsnID*/0, /*RendererID*/1,
|
|
|
|
// R21C-NEXT: GIR_ComplexRenderer, /*InsnID*/0, /*RendererID*/0,
|
|
|
|
// R21C-NEXT: GIR_MergeMemOperands, /*InsnID*/0, /*MergeInsnID's*/0, GIU_MergeMemOperands_EndOfList,
|
|
|
|
// R21C-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// R21C-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// R21C-NEXT: // GIR_Coverage, 21,
|
|
|
|
// R21C-NEXT: GIR_Done,
|
|
|
|
// R21C-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
|
|
|
//
|
[GlobalISel][InstructionSelect] Switching MatchTable over opcodes, perf patch 4
This patch continues a series of patches started by r332907 (reapplied
as r332917)
In this commit we introduce a new matching opcode GIM_SwitchOpcode
that implements a jump table over opcodes and start emitting them for
root instructions.
This is expected to decrease time GlobalISel spends in its
InstructionSelect pass by roughly 20% for an -O0 build as measured on
sqlite3-amalgamation (http://sqlite.org/download.html) targeting
AArch64.
To some degree, we assume here that the opcodes form a dense set,
which is true at the moment for all upstream targets given the
limitations of our rule importing mechanism.
It might not be true for out of tree targets, specifically due to
pseudo's. If so, we might noticeably increase the size of the
MatchTable with this patch due to padding zeros. This will be
addressed later.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333017
2018-05-23 03:37:59 +08:00
|
|
|
// R21O-NEXT: GIM_Reject,
|
2018-05-23 10:04:19 +08:00
|
|
|
// R21O-NEXT: // Label [[GROUP_NUM]]: @[[GROUP]]
|
|
|
|
// R21O-NEXT: GIM_Reject,
|
[GlobalISel][InstructionSelect] Switching MatchTable over opcodes, perf patch 4
This patch continues a series of patches started by r332907 (reapplied
as r332917)
In this commit we introduce a new matching opcode GIM_SwitchOpcode
that implements a jump table over opcodes and start emitting them for
root instructions.
This is expected to decrease time GlobalISel spends in its
InstructionSelect pass by roughly 20% for an -O0 build as measured on
sqlite3-amalgamation (http://sqlite.org/download.html) targeting
AArch64.
To some degree, we assume here that the opcodes form a dense set,
which is true at the moment for all upstream targets given the
limitations of our rule importing mechanism.
It might not be true for out of tree targets, specifically due to
pseudo's. If so, we might noticeably increase the size of the
MatchTable with this patch due to padding zeros. This will be
addressed later.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333017
2018-05-23 03:37:59 +08:00
|
|
|
// R21O: // Label [[DEFAULT_NUM]]: @[[DEFAULT]]
|
|
|
|
// R21O-NEXT: GIM_Reject,
|
|
|
|
// R21O-NEXT: };
|
2017-10-10 02:14:53 +08:00
|
|
|
|
2018-02-17 06:37:15 +08:00
|
|
|
//===- Test a pattern with ComplexPattern operands. -----------------------===//
|
2017-07-06 16:12:20 +08:00
|
|
|
//
|
2018-05-23 10:04:19 +08:00
|
|
|
// R20O-NEXT: GIM_SwitchOpcode, /*MI*/0, /*[*/{{[0-9]+}}, {{[0-9]+}}, /*)*//*default:*//*Label [[DEFAULT_NUM:[0-9]+]]*/ [[DEFAULT:[0-9]+]],
|
|
|
|
// R20O-NEXT: /*TargetOpcode::G_ADD*//*Label [[CASE_ADD_NUM:[0-9]+]]*/ [[CASE_ADD:[0-9]+]],
|
|
|
|
// R20O: /*TargetOpcode::G_SUB*//*Label [[CASE_SUB_NUM:[0-9]+]]*/ [[CASE_SUB:[0-9]+]],
|
|
|
|
// R20O: // Label [[CASE_ADD_NUM]]: @[[CASE_ADD]]
|
|
|
|
// R20O: // Label [[CASE_SUB_NUM]]: @[[CASE_SUB]]
|
|
|
|
// R20O-NEXT: GIM_Try, /*On fail goto*//*Label [[GROUP_NUM:[0-9]+]]*/ [[GROUP:[0-9]+]],
|
|
|
|
// R20O-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
[GlobalISel][InstructionSelect] Maximizing # of Group's common conditions, perf patch 8
This patch continues a series of patches started by r332907 (reapplied
as r332917).
In this commit we greedily stuff 2nd level GroupMatcher's common
conditions with as many predicates as possible. This is purely
post-processing and it doesn't change which rules are put into the
groups in the first place: that decision is made by looking at the
first common predicate only.
The compile time improvements are minor and well within error margin,
however, it's highly improbable that this transformation could
pessimize performance, thus I'm still committing it for potential
gains for targets not implementing GlobalISel yet and out of tree
targets.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333139
2018-05-24 06:50:53 +08:00
|
|
|
// R20O-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
|
|
|
// R20O-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
2018-05-24 07:58:10 +08:00
|
|
|
// R20O-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
2018-05-23 10:04:19 +08:00
|
|
|
//
|
2018-05-22 07:28:51 +08:00
|
|
|
// R20N: GIM_Try, /*On fail goto*//*Label [[PREV_NUM:[0-9]+]]*/ [[PREV:[0-9]+]], // Rule ID 21 //
|
|
|
|
// R20N: // Label [[PREV_NUM]]: @[[PREV]]
|
|
|
|
//
|
|
|
|
// R20C-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]], // Rule ID 20 //
|
|
|
|
//
|
|
|
|
// R20N-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/3,
|
|
|
|
// R20N-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_SUB,
|
|
|
|
// R20N-NEXT: // MIs[0] dst
|
|
|
|
// R20N-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// R20N-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// R20N-NEXT: // MIs[0] src1
|
2018-05-24 03:16:59 +08:00
|
|
|
// R20N-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// R20N-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
2018-05-23 10:04:19 +08:00
|
|
|
// R20N-NEXT: // MIs[0] src2
|
[GlobalISel][InstructionSelect] Maximizing # of Group's common conditions, perf patch 8
This patch continues a series of patches started by r332907 (reapplied
as r332917).
In this commit we greedily stuff 2nd level GroupMatcher's common
conditions with as many predicates as possible. This is purely
post-processing and it doesn't change which rules are put into the
groups in the first place: that decision is made by looking at the
first common predicate only.
The compile time improvements are minor and well within error margin,
however, it's highly improbable that this transformation could
pessimize performance, thus I'm still committing it for potential
gains for targets not implementing GlobalISel yet and out of tree
targets.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333139
2018-05-24 06:50:53 +08:00
|
|
|
// R20N-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
2018-05-23 10:04:19 +08:00
|
|
|
// R20O-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
2018-05-24 07:58:10 +08:00
|
|
|
// R20C-NEXT: GIM_CheckComplexPattern, /*MI*/0, /*Op*/2, /*Renderer*/0, GICP_gi_complex,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R20C-NEXT: // (sub:{ *:[i32] } GPR32:{ *:[i32] }:$src1, complex:{ *:[i32] }:$src2) => (INSN1:{ *:[i32] } GPR32:{ *:[i32] }:$src1, complex:{ *:[i32] }:$src2)
|
|
|
|
// R20C-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::INSN1,
|
|
|
|
// R20C-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
|
|
|
// R20C-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/1, // src1
|
|
|
|
// R20C-NEXT: GIR_ComplexRenderer, /*InsnID*/0, /*RendererID*/0,
|
|
|
|
// R20C-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// R20C-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// R20C-NEXT: // GIR_Coverage, 20,
|
|
|
|
// R20C-NEXT: GIR_Done,
|
|
|
|
// R20C-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
2018-05-23 10:04:19 +08:00
|
|
|
//
|
|
|
|
// R20O: // Label [[GROUP_NUM]]: @[[GROUP]]
|
|
|
|
// R20O-NEXT: GIM_Reject,
|
|
|
|
// R20O: // Label [[DEFAULT_NUM]]: @[[DEFAULT]]
|
|
|
|
// R20O-NEXT: GIM_Reject,
|
|
|
|
// R20O-NEXT: };
|
[tablegen][globalisel] Add support for nested instruction matching.
Summary:
Lift the restrictions that prevented the tree walking introduced in the
previous change and add support for patterns like:
(G_ADD (G_MUL (G_SEXT $src1), (G_SEXT $src2)), $src3) -> SMADDWrrr $dst, $src1, $src2, $src3
Also adds support for G_SEXT and G_ZEXT to support these cases.
One particular aspect of this that I should draw attention to is that I've
tried to be overly conservative in determining the safety of matches that
involve non-adjacent instructions and multiple basic blocks. This is intended
to be used as a cheap initial check and we may add a more expensive check in
the future. The current rules are:
* Reject if any instruction may load/store (we'd need to check for intervening
memory operations.
* Reject if any instruction has implicit operands.
* Reject if any instruction has unmodelled side-effects.
See isObviouslySafeToFold().
Reviewers: t.p.northover, javed.absar, qcolombet, aditya_nandakumar, ab, rovka
Reviewed By: ab
Subscribers: igorb, dberris, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D30539
llvm-svn: 299430
2017-04-04 21:25:23 +08:00
|
|
|
|
2018-02-17 06:37:15 +08:00
|
|
|
def INSN1 : I<(outs GPR32:$dst), (ins GPR32:$src1, complex:$src2), []>;
|
|
|
|
def : Pat<(sub GPR32:$src1, complex:$src2), (INSN1 GPR32:$src1, complex:$src2)>;
|
[tablegen][globalisel] Add support for nested instruction matching.
Summary:
Lift the restrictions that prevented the tree walking introduced in the
previous change and add support for patterns like:
(G_ADD (G_MUL (G_SEXT $src1), (G_SEXT $src2)), $src3) -> SMADDWrrr $dst, $src1, $src2, $src3
Also adds support for G_SEXT and G_ZEXT to support these cases.
One particular aspect of this that I should draw attention to is that I've
tried to be overly conservative in determining the safety of matches that
involve non-adjacent instructions and multiple basic blocks. This is intended
to be used as a cheap initial check and we may add a more expensive check in
the future. The current rules are:
* Reject if any instruction may load/store (we'd need to check for intervening
memory operations.
* Reject if any instruction has implicit operands.
* Reject if any instruction has unmodelled side-effects.
See isObviouslySafeToFold().
Reviewers: t.p.northover, javed.absar, qcolombet, aditya_nandakumar, ab, rovka
Reviewed By: ab
Subscribers: igorb, dberris, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D30539
llvm-svn: 299430
2017-04-04 21:25:23 +08:00
|
|
|
|
2018-02-17 06:37:15 +08:00
|
|
|
//===- Test a pattern with multiple ComplexPattern operands. --------------===//
|
|
|
|
//
|
|
|
|
def : GINodeEquiv<G_SELECT, select>;
|
|
|
|
let mayLoad = 1 in {
|
|
|
|
def INSN2 : I<(outs GPR32:$dst), (ins GPR32Op:$src1, complex:$src2, complex:$src3), []>;
|
|
|
|
}
|
2018-06-16 07:13:43 +08:00
|
|
|
def frag : PatFrag<(ops node:$a, node:$b, node:$c),
|
|
|
|
(select node:$a, node:$b, node:$c),
|
|
|
|
[{ return true; // C++ code }]> {
|
|
|
|
let GISelPredicateCode = [{ return true; // C++ code }];
|
|
|
|
}
|
|
|
|
def : Pat<(frag GPR32:$src1, complex:$src2, complex:$src3),
|
2018-02-17 06:37:15 +08:00
|
|
|
(INSN2 GPR32:$src1, complex:$src3, complex:$src2)>;
|
[globalisel] Decouple src pattern operands from dst pattern operands.
Summary:
This isn't testable for AArch64 by itself so this patch also adds
support for constant immediates in the pattern and physical
register uses in the result.
The new IntOperandMatcher matches the constant in patterns such as
'(set $rd:GPR32, (G_XOR $rs:GPR32, -1))'. It's always safe to fold
immediates into an instruction so this is the first rule that will match
across multiple BB's.
The Renderer hierarchy is responsible for adding operands to the result
instruction. Renderers can copy operands (CopyRenderer) or add physical
registers (in particular %wzr and %xzr) to the result instruction
in any order (OperandMatchers now import the operand names from
SelectionDAG to allow renderers to access any operand). This allows us to
emit the result instruction for:
%1 = G_XOR %0, -1 --> %1 = ORNWrr %wzr, %0
%1 = G_XOR -1, %0 --> %1 = ORNWrr %wzr, %0
although the latter is untested since the matcher/importer has not been
taught about commutativity yet.
Added BuildMIAction which can build new instructions and mutate them where
possible. W.r.t the mutation aspect, MatchActions are now told the name of
an instruction they can recycle and BuildMIAction will emit mutation code
when the renderers are appropriate. They are appropriate when all operands
are rendered using CopyRenderer and the indices are the same as the matcher.
This currently assumes that all operands have at least one matcher.
Finally, this change also fixes a crash in
AArch64InstructionSelector::select() caused by an immediate operand
passing isImm() rather than isCImm(). This was uncovered by the other
changes and was detected by existing tests.
Depends on D29711
Reviewers: t.p.northover, ab, qcolombet, rovka, aditya_nandakumar, javed.absar
Reviewed By: rovka
Subscribers: aemerson, dberris, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D29712
llvm-svn: 296131
2017-02-24 23:43:30 +08:00
|
|
|
|
[globalisel][tablegen] Partially fix compile-time regressions by converting matcher to state-machine(s)
Summary:
Replace the matcher if-statements for each rule with a state-machine. This
significantly reduces compile time, memory allocations, and cumulative memory
allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is
recommitted.
The following patches will expand on this further to fully fix the regressions.
Reviewers: rovka, ab, t.p.northover, qcolombet, aditya_nandakumar
Reviewed By: ab
Subscribers: vitalybuka, aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D33758
llvm-svn: 307079
2017-07-04 22:35:06 +08:00
|
|
|
//===- Test a more complex multi-instruction match. -----------------------===//
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
2018-05-23 10:04:19 +08:00
|
|
|
// R00O-NEXT: GIM_SwitchOpcode, /*MI*/0, /*[*/{{[0-9]+}}, {{[0-9]+}}, /*)*//*default:*//*Label [[DEFAULT_NUM:[0-9]+]]*/ [[DEFAULT:[0-9]+]],
|
|
|
|
// R00O-NEXT: /*TargetOpcode::G_ADD*//*Label [[CASE_ADD_NUM:[0-9]+]]*/ [[CASE_ADD:[0-9]+]],
|
|
|
|
// R00O: /*TargetOpcode::G_SUB*//*Label [[CASE_SUB_NUM:[0-9]+]]*/ [[CASE_SUB:[0-9]+]],
|
|
|
|
// R00O: // Label [[CASE_ADD_NUM]]: @[[CASE_ADD]]
|
|
|
|
// R00O: // Label [[CASE_SUB_NUM]]: @[[CASE_SUB]]
|
|
|
|
// R00O-NEXT: GIM_Try, /*On fail goto*//*Label [[GROUP_NUM:[0-9]+]]*/ [[GROUP:[0-9]+]],
|
|
|
|
// R00O-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
[GlobalISel][InstructionSelect] Maximizing # of Group's common conditions, perf patch 8
This patch continues a series of patches started by r332907 (reapplied
as r332917).
In this commit we greedily stuff 2nd level GroupMatcher's common
conditions with as many predicates as possible. This is purely
post-processing and it doesn't change which rules are put into the
groups in the first place: that decision is made by looking at the
first common predicate only.
The compile time improvements are minor and well within error margin,
however, it's highly improbable that this transformation could
pessimize performance, thus I'm still committing it for potential
gains for targets not implementing GlobalISel yet and out of tree
targets.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333139
2018-05-24 06:50:53 +08:00
|
|
|
// R00O-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
|
|
|
// R00O-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
2018-05-24 07:58:10 +08:00
|
|
|
// R00O-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
2018-05-23 10:04:19 +08:00
|
|
|
//
|
2018-05-22 07:28:51 +08:00
|
|
|
// R00C: GIM_Try, /*On fail goto*//*Label [[PREV_NUM:[0-9]+]]*/ [[PREV:[0-9]+]], // Rule ID 20 //
|
|
|
|
// R00C: // Label [[PREV_NUM]]: @[[PREV]]
|
|
|
|
//
|
|
|
|
// R00C-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]], // Rule ID 0 //
|
|
|
|
// R00C-NEXT: GIM_CheckFeatures, GIFBS_HasA,
|
|
|
|
// R00N-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/3,
|
|
|
|
// R00N-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_SUB,
|
|
|
|
// R00N-NEXT: // MIs[0] dst
|
|
|
|
// R00N-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// R00N-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
2018-05-24 03:16:59 +08:00
|
|
|
// R00N-NEXT: // MIs[0] Operand 1
|
[GlobalISel][InstructionSelect] Maximizing # of Group's common conditions, perf patch 8
This patch continues a series of patches started by r332907 (reapplied
as r332917).
In this commit we greedily stuff 2nd level GroupMatcher's common
conditions with as many predicates as possible. This is purely
post-processing and it doesn't change which rules are put into the
groups in the first place: that decision is made by looking at the
first common predicate only.
The compile time improvements are minor and well within error margin,
however, it's highly improbable that this transformation could
pessimize performance, thus I'm still committing it for potential
gains for targets not implementing GlobalISel yet and out of tree
targets.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333139
2018-05-24 06:50:53 +08:00
|
|
|
// R00N-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R00C-NEXT: GIM_RecordInsn, /*DefineMI*/1, /*MI*/0, /*OpIdx*/1, // MIs[1]
|
|
|
|
// R00N-NEXT: GIM_CheckNumOperands, /*MI*/1, /*Expected*/3,
|
|
|
|
// R00C-NEXT: GIM_CheckOpcode, /*MI*/1, TargetOpcode::G_SUB,
|
|
|
|
// R00N-NEXT: // MIs[1] Operand 0
|
|
|
|
// R00N-NEXT: GIM_CheckType, /*MI*/1, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// R00N-NEXT: // MIs[1] src1
|
|
|
|
// R00C-NEXT: GIM_CheckType, /*MI*/1, /*Op*/1, /*Type*/GILLT_s32,
|
2018-05-23 10:04:19 +08:00
|
|
|
// R00O-NEXT: GIM_CheckType, /*MI*/1, /*Op*/2, /*Type*/GILLT_s32,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R00N-NEXT: GIM_CheckRegBankForClass, /*MI*/1, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// R00N-NEXT: // MIs[1] src2
|
|
|
|
// R00N-NEXT: GIM_CheckType, /*MI*/1, /*Op*/2, /*Type*/GILLT_s32,
|
|
|
|
// R00N-NEXT: GIM_CheckRegBankForClass, /*MI*/1, /*Op*/2, /*RC*/MyTarget::GPR32RegClassID,
|
2018-05-24 03:16:59 +08:00
|
|
|
// R00N-NEXT: // MIs[0] Operand 2
|
|
|
|
// R00N-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
2018-05-24 07:58:10 +08:00
|
|
|
// R00O-NEXT: GIM_CheckRegBankForClass, /*MI*/1, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// R00O-NEXT: GIM_CheckRegBankForClass, /*MI*/1, /*Op*/2, /*RC*/MyTarget::GPR32RegClassID,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R00C-NEXT: GIM_RecordInsn, /*DefineMI*/2, /*MI*/0, /*OpIdx*/2, // MIs[2]
|
|
|
|
// R00N-NEXT: GIM_CheckNumOperands, /*MI*/2, /*Expected*/3,
|
|
|
|
// R00C-NEXT: GIM_CheckOpcode, /*MI*/2, TargetOpcode::G_SUB,
|
|
|
|
// R00N-NEXT: // MIs[2] Operand 0
|
|
|
|
// R00N-NEXT: GIM_CheckType, /*MI*/2, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// R00N-NEXT: // MIs[2] src3
|
|
|
|
// R00C-NEXT: GIM_CheckType, /*MI*/2, /*Op*/1, /*Type*/GILLT_s32,
|
2018-05-23 10:04:19 +08:00
|
|
|
// R00O-NEXT: GIM_CheckType, /*MI*/2, /*Op*/2, /*Type*/GILLT_s32,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R00N-NEXT: GIM_CheckRegBankForClass, /*MI*/2, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// R00N-NEXT: // MIs[2] src4
|
|
|
|
// R00N-NEXT: GIM_CheckType, /*MI*/2, /*Op*/2, /*Type*/GILLT_s32,
|
|
|
|
// R00N-NEXT: GIM_CheckRegBankForClass, /*MI*/2, /*Op*/2, /*RC*/MyTarget::GPR32RegClassID,
|
2018-05-23 10:04:19 +08:00
|
|
|
// R00O-NEXT: GIM_CheckRegBankForClass, /*MI*/2, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// R00O-NEXT: GIM_CheckRegBankForClass, /*MI*/2, /*Op*/2, /*RC*/MyTarget::GPR32RegClassID,
|
2018-05-24 07:58:10 +08:00
|
|
|
// R00C-NEXT: GIM_CheckIsSafeToFold, /*InsnID*/1,
|
|
|
|
// R00C-NEXT: GIM_CheckIsSafeToFold, /*InsnID*/2,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R00C-NEXT: // (sub:{ *:[i32] } (sub:{ *:[i32] } GPR32:{ *:[i32] }:$src1, GPR32:{ *:[i32] }:$src2), (sub:{ *:[i32] } GPR32:{ *:[i32] }:$src3, GPR32:{ *:[i32] }:$src4)) => (INSNBOB:{ *:[i32] } GPR32:{ *:[i32] }:$src1, GPR32:{ *:[i32] }:$src2, GPR32:{ *:[i32] }:$src3, GPR32:{ *:[i32] }:$src4)
|
|
|
|
// R00C-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::INSNBOB,
|
|
|
|
// R00C-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
|
|
|
// R00C-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/1, /*OpIdx*/1, // src1
|
|
|
|
// R00C-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/1, /*OpIdx*/2, // src2
|
|
|
|
// R00C-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/2, /*OpIdx*/1, // src3
|
|
|
|
// R00C-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/2, /*OpIdx*/2, // src4
|
|
|
|
// R00C-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// R00C-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// R00C-NEXT: // GIR_Coverage, 0,
|
|
|
|
// R00C-NEXT: GIR_Done,
|
|
|
|
// R00C-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
2018-05-23 10:04:19 +08:00
|
|
|
//
|
|
|
|
// R00O-NEXT: GIM_Reject,
|
|
|
|
// R00O-NEXT: // Label [[GROUP_NUM]]: @[[GROUP]]
|
|
|
|
// R00O-NEXT: GIM_Reject,
|
|
|
|
// R00O: // Label [[DEFAULT_NUM]]: @[[DEFAULT]]
|
|
|
|
// R00O-NEXT: GIM_Reject,
|
|
|
|
// R00O-NEXT: };
|
[globalisel][tablegen] Partially fix compile-time regressions by converting matcher to state-machine(s)
Summary:
Replace the matcher if-statements for each rule with a state-machine. This
significantly reduces compile time, memory allocations, and cumulative memory
allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is
recommitted.
The following patches will expand on this further to fully fix the regressions.
Reviewers: rovka, ab, t.p.northover, qcolombet, aditya_nandakumar
Reviewed By: ab
Subscribers: vitalybuka, aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D33758
llvm-svn: 307079
2017-07-04 22:35:06 +08:00
|
|
|
|
|
|
|
def INSNBOB : I<(outs GPR32:$dst), (ins GPR32:$src1, GPR32:$src2, GPR32:$src3, GPR32:$src4),
|
|
|
|
[(set GPR32:$dst,
|
|
|
|
(sub (sub GPR32:$src1, GPR32:$src2), (sub GPR32:$src3, GPR32:$src4)))]>,
|
|
|
|
Requires<[HasA]>;
|
|
|
|
|
2018-02-17 06:37:15 +08:00
|
|
|
//===- Test a simple pattern with an intrinsic. ---------------------------===//
|
[globalisel][tablegen] Fix patterns involving multiple ComplexPatterns.
Summary:
Temporaries are now allocated to operands instead of predicates and this
allocation is used to correctly pair up the rendered operands with the
matched operands.
Previously, ComplexPatterns were allocated temporaries independently in the
Src Pattern and Dst Pattern, leading to mismatches. Additionally, the Dst
Pattern failed to account for the allocated index and therefore always used
temporary 0, 1, ... when it should have used base+0, base+1, ...
Thanks to Aditya Nandakumar for noticing the bug.
Depends on D30539
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: rovka
Subscribers: igorb, dberris, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D31054
llvm-svn: 299538
2017-04-05 21:14:03 +08:00
|
|
|
//
|
2018-05-23 10:04:19 +08:00
|
|
|
// R01O-NEXT: GIM_SwitchOpcode, /*MI*/0, /*[*/{{[0-9]+}}, {{[0-9]+}}, /*)*//*default:*//*Label [[DEFAULT_NUM:[0-9]+]]*/ [[DEFAULT:[0-9]+]],
|
|
|
|
// R01O-NEXT: /*TargetOpcode::G_ADD*//*Label [[CASE_ADD_NUM:[0-9]+]]*/ [[CASE_ADD:[0-9]+]],
|
|
|
|
// R01O: /*TargetOpcode::G_INTRINSIC*//*Label [[CASE_INTRINSIC_NUM:[0-9]+]]*/ [[CASE_INTRINSIC:[0-9]+]],
|
|
|
|
// R01O: // Label [[CASE_ADD_NUM]]: @[[CASE_ADD]]
|
|
|
|
// R01O: // Label [[CASE_INTRINSIC_NUM]]: @[[CASE_INTRINSIC]]
|
|
|
|
//
|
2018-05-22 07:28:51 +08:00
|
|
|
// R01N: GIM_Try, /*On fail goto*//*Label [[PREV_NUM:[0-9]+]]*/ [[PREV:[0-9]+]], // Rule ID 0 //
|
|
|
|
// R01N: // Label [[PREV_NUM]]: @[[PREV]]
|
|
|
|
//
|
|
|
|
// R01C-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]], // Rule ID 1 //
|
|
|
|
// R01C-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/3,
|
|
|
|
//
|
2018-05-23 10:04:19 +08:00
|
|
|
// R01O-NEXT: GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, Intrinsic::mytarget_nop,
|
|
|
|
// R01O-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// R01O-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
|
|
|
// R01O-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
//
|
2018-05-22 07:28:51 +08:00
|
|
|
// R01N-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_INTRINSIC,
|
|
|
|
// R01N-NEXT: // MIs[0] dst
|
|
|
|
// R01N-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// R01N-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// R01N-NEXT: // MIs[0] Operand 1
|
|
|
|
// R01N-NEXT: GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, Intrinsic::mytarget_nop,
|
|
|
|
// R01N-NEXT: // MIs[0] src1
|
|
|
|
// R01N-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
|
|
|
//
|
|
|
|
// R01C-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/2, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// R01C-NEXT: // (intrinsic_wo_chain:{ *:[i32] } [[ID:[0-9]+]]:{ *:[iPTR] }, GPR32:{ *:[i32] }:$src1) => (MOV:{ *:[i32] } GPR32:{ *:[i32] }:$src1)
|
|
|
|
// R01C-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::MOV,
|
|
|
|
// R01C-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
|
|
|
// R01C-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/2, // src1
|
|
|
|
// R01C-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// R01C-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// R01C-NEXT: // GIR_Coverage, 1,
|
|
|
|
// R01C-NEXT: GIR_Done,
|
|
|
|
// R01C-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
2018-05-23 10:04:19 +08:00
|
|
|
//
|
|
|
|
// R01O-NEXT: GIM_Reject,
|
|
|
|
// R01O: // Label [[DEFAULT_NUM]]: @[[DEFAULT]]
|
|
|
|
// R01O-NEXT: GIM_Reject,
|
[globalisel][tablegen] Fix patterns involving multiple ComplexPatterns.
Summary:
Temporaries are now allocated to operands instead of predicates and this
allocation is used to correctly pair up the rendered operands with the
matched operands.
Previously, ComplexPatterns were allocated temporaries independently in the
Src Pattern and Dst Pattern, leading to mismatches. Additionally, the Dst
Pattern failed to account for the allocated index and therefore always used
temporary 0, 1, ... when it should have used base+0, base+1, ...
Thanks to Aditya Nandakumar for noticing the bug.
Depends on D30539
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: rovka
Subscribers: igorb, dberris, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D31054
llvm-svn: 299538
2017-04-05 21:14:03 +08:00
|
|
|
|
2018-02-17 06:37:15 +08:00
|
|
|
def MOV : I<(outs GPR32:$dst), (ins GPR32:$src1),
|
|
|
|
[(set GPR32:$dst, (int_mytarget_nop GPR32:$src1))]>;
|
|
|
|
|
[globalisel][tablegen] Add experimental support for OperandWithDefaultOps, PredicateOperand, and OptionalDefOperand
Summary:
As far as instruction selection is concerned, all three appear to be same thing.
Support for these operands is experimental since AArch64 doesn't make use
of them and the in-tree targets that do use them (AMDGPU for
OperandWithDefaultOps, AMDGPU/ARM/Hexagon/Lanai for PredicateOperand, and ARM
for OperandWithDefaultOps) are not using tablegen-erated GlobalISel yet.
Reviewers: rovka, aditya_nandakumar, t.p.northover, qcolombet, ab
Reviewed By: rovka
Subscribers: inglorion, aemerson, rengolin, mehdi_amini, dberris, kristof.beyls, igorb, tpr, llvm-commits
Differential Revision: https://reviews.llvm.org/D31135
llvm-svn: 300037
2017-04-12 16:23:08 +08:00
|
|
|
//===- Test a simple pattern with a default operand. ----------------------===//
|
|
|
|
//
|
2018-05-23 10:04:19 +08:00
|
|
|
// R02O-NEXT: GIM_SwitchOpcode, /*MI*/0, /*[*/{{[0-9]+}}, {{[0-9]+}}, /*)*//*default:*//*Label [[DEFAULT_NUM:[0-9]+]]*/ [[DEFAULT:[0-9]+]],
|
|
|
|
// R02O-NEXT: /*TargetOpcode::G_ADD*//*Label [[CASE_ADD_NUM:[0-9]+]]*/ [[CASE_ADD:[0-9]+]],
|
|
|
|
// R02O: /*TargetOpcode::G_XOR*//*Label [[CASE_XOR_NUM:[0-9]+]]*/ [[CASE_XOR:[0-9]+]],
|
|
|
|
// R02O: // Label [[CASE_ADD_NUM]]: @[[CASE_ADD]]
|
|
|
|
// R02O: // Label [[CASE_XOR_NUM]]: @[[CASE_XOR]]
|
|
|
|
// R02O-NEXT: GIM_Try, /*On fail goto*//*Label [[GROUP_NUM:[0-9]+]]*/ [[GROUP:[0-9]+]],
|
|
|
|
// R02O-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
[GlobalISel][InstructionSelect] Maximizing # of Group's common conditions, perf patch 8
This patch continues a series of patches started by r332907 (reapplied
as r332917).
In this commit we greedily stuff 2nd level GroupMatcher's common
conditions with as many predicates as possible. This is purely
post-processing and it doesn't change which rules are put into the
groups in the first place: that decision is made by looking at the
first common predicate only.
The compile time improvements are minor and well within error margin,
however, it's highly improbable that this transformation could
pessimize performance, thus I'm still committing it for potential
gains for targets not implementing GlobalISel yet and out of tree
targets.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333139
2018-05-24 06:50:53 +08:00
|
|
|
// R02O-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
|
|
|
// R02O-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
2018-05-24 07:58:10 +08:00
|
|
|
// R02O-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// R02O-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
2018-05-23 10:04:19 +08:00
|
|
|
//
|
2018-05-22 07:28:51 +08:00
|
|
|
// R02N: GIM_Try, /*On fail goto*//*Label [[PREV_NUM:[0-9]+]]*/ [[PREV:[0-9]+]], // Rule ID 1 //
|
|
|
|
// R02N: // Label [[PREV_NUM]]: @[[PREV]]
|
|
|
|
//
|
|
|
|
// R02C-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]], // Rule ID 2 //
|
|
|
|
//
|
|
|
|
// R02N-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/3,
|
|
|
|
// R02N-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_XOR,
|
|
|
|
// R02N-NEXT: // MIs[0] dst
|
|
|
|
// R02N-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// R02N-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// R02N-NEXT: // MIs[0] src1
|
2018-05-24 03:16:59 +08:00
|
|
|
// R02N-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
2018-05-22 07:28:51 +08:00
|
|
|
// R02N-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
2018-05-24 03:16:59 +08:00
|
|
|
// R02N-NEXT: // MIs[0] Operand 2
|
[GlobalISel][InstructionSelect] Maximizing # of Group's common conditions, perf patch 8
This patch continues a series of patches started by r332907 (reapplied
as r332917).
In this commit we greedily stuff 2nd level GroupMatcher's common
conditions with as many predicates as possible. This is purely
post-processing and it doesn't change which rules are put into the
groups in the first place: that decision is made by looking at the
first common predicate only.
The compile time improvements are minor and well within error margin,
however, it's highly improbable that this transformation could
pessimize performance, thus I'm still committing it for potential
gains for targets not implementing GlobalISel yet and out of tree
targets.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333139
2018-05-24 06:50:53 +08:00
|
|
|
// R02N-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
|
|
|
//
|
2018-05-22 07:28:51 +08:00
|
|
|
// R02C-NEXT: GIM_CheckConstantInt, /*MI*/0, /*Op*/2, -2
|
|
|
|
// R02C-NEXT: // (xor:{ *:[i32] } GPR32:{ *:[i32] }:$src1, -2:{ *:[i32] }) => (XORI:{ *:[i32] } GPR32:{ *:[i32] }:$src1)
|
|
|
|
// R02C-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::XORI,
|
|
|
|
// R02C-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
|
|
|
// R02C-NEXT: GIR_AddImm, /*InsnID*/0, /*Imm*/-1,
|
|
|
|
// R02C-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/1, // src1
|
|
|
|
// R02C-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// R02C-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// R02C-NEXT: // GIR_Coverage, 2,
|
|
|
|
// R02C-NEXT: GIR_Done,
|
|
|
|
// R02C-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
2018-05-23 10:04:19 +08:00
|
|
|
//
|
|
|
|
// R02O: // Label [[DEFAULT_NUM]]: @[[DEFAULT]]
|
|
|
|
// R02O-NEXT: GIM_Reject,
|
[globalisel][tablegen] Add experimental support for OperandWithDefaultOps, PredicateOperand, and OptionalDefOperand
Summary:
As far as instruction selection is concerned, all three appear to be same thing.
Support for these operands is experimental since AArch64 doesn't make use
of them and the in-tree targets that do use them (AMDGPU for
OperandWithDefaultOps, AMDGPU/ARM/Hexagon/Lanai for PredicateOperand, and ARM
for OperandWithDefaultOps) are not using tablegen-erated GlobalISel yet.
Reviewers: rovka, aditya_nandakumar, t.p.northover, qcolombet, ab
Reviewed By: rovka
Subscribers: inglorion, aemerson, rengolin, mehdi_amini, dberris, kristof.beyls, igorb, tpr, llvm-commits
Differential Revision: https://reviews.llvm.org/D31135
llvm-svn: 300037
2017-04-12 16:23:08 +08:00
|
|
|
|
|
|
|
// The -2 is just to distinguish it from the 'not' case below.
|
|
|
|
def XORI : I<(outs GPR32:$dst), (ins m1:$src2, GPR32:$src1),
|
|
|
|
[(set GPR32:$dst, (xor GPR32:$src1, -2))]>;
|
|
|
|
|
|
|
|
//===- Test a simple pattern with a default register operand. -------------===//
|
|
|
|
//
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/3,
|
[TableGen][GlobalISel] Optimize MatchTable for faster instruction selection
*** Context ***
Prior to this patchw, the table generated for matching instruction was
straight forward but highly inefficient.
Basically, each pattern generates its own set of self contained checks
and actions.
E.g., TableGen generated:
// First pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDrr
// Second pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDri
// Third pattern
CheckNumOperand 3
CheckOpcode G_SUB
...
Build SUBrr
*** Problem ***
Because of that generation, a *lot* of check were redundant between each
pattern and were checked every single time until we reach the pattern
that matches.
E.g., Taking the previous table, let say we are matching a G_SUB, that
means we were going to check all the rules for G_ADD before looking at
the G_SUB rule. In particular we are going to do:
check 3 operands; PASS
check G_ADD; FAIL
; Next rule
check 3 operands; PASS (but we already knew that!)
check G_ADD; FAIL (well it is still not true)
; Next rule
check 3 operands; PASS (really!!)
check G_SUB; PASS (at last :P)
*** Proposed Solution ***
This patch introduces a concept of group of rules (GroupMatcher) that
share some predicates and only get checked once for the whole group.
This patch only creates groups with one nesting level. Conceptually
there is nothing preventing us for having deeper nest level. However,
the current implementation is not smart enough to share the recording
(aka capturing) of values. That limits its ability to do more sharing.
For the given example the current patch will generate:
// First group
CheckOpcode G_ADD
// First pattern
CheckNumOperand 3
...
Build ADDrr
// Second pattern
CheckNumOperand 3
...
Build ADDri
// Second group
CheckOpcode G_SUB
// Third pattern
CheckNumOperand 3
...
Build SUBrr
But if we allowed several nesting level, it could create a sub group
for the checknumoperand 3.
(We would need to call optimizeRules on the rules within a group.)
*** Result ***
With only one level of nesting, the instruction selection pass is up
to 4x faster. For instance, one instruction now takes 500 checks,
instead of 24k! With more nesting we could get in the tens I believe.
Differential Revision: https://reviews.llvm.org/D39034
rdar://problem/34670699
llvm-svn: 321017
2017-12-19 03:47:41 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_XOR,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] src1
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] Operand 2
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckConstantInt, /*MI*/0, /*Op*/2, -3
|
|
|
|
// NOOPT-NEXT: // (xor:{ *:[i32] } GPR32:{ *:[i32] }:$src1, -3:{ *:[i32] }) => (XOR:{ *:[i32] } GPR32:{ *:[i32] }:$src1)
|
|
|
|
// NOOPT-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::XOR,
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
|
|
|
// NOOPT-NEXT: GIR_AddRegister, /*InsnID*/0, MyTarget::R0,
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/1, // src1
|
|
|
|
// NOOPT-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 3,
|
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
[globalisel][tablegen] Add experimental support for OperandWithDefaultOps, PredicateOperand, and OptionalDefOperand
Summary:
As far as instruction selection is concerned, all three appear to be same thing.
Support for these operands is experimental since AArch64 doesn't make use
of them and the in-tree targets that do use them (AMDGPU for
OperandWithDefaultOps, AMDGPU/ARM/Hexagon/Lanai for PredicateOperand, and ARM
for OperandWithDefaultOps) are not using tablegen-erated GlobalISel yet.
Reviewers: rovka, aditya_nandakumar, t.p.northover, qcolombet, ab
Reviewed By: rovka
Subscribers: inglorion, aemerson, rengolin, mehdi_amini, dberris, kristof.beyls, igorb, tpr, llvm-commits
Differential Revision: https://reviews.llvm.org/D31135
llvm-svn: 300037
2017-04-12 16:23:08 +08:00
|
|
|
|
|
|
|
// The -3 is just to distinguish it from the 'not' case below and the other default op case above.
|
|
|
|
def XOR : I<(outs GPR32:$dst), (ins Z:$src2, GPR32:$src1),
|
|
|
|
[(set GPR32:$dst, (xor GPR32:$src1, -3))]>;
|
|
|
|
|
|
|
|
//===- Test a simple pattern with a multiple default operands. ------------===//
|
|
|
|
//
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/3,
|
[TableGen][GlobalISel] Optimize MatchTable for faster instruction selection
*** Context ***
Prior to this patchw, the table generated for matching instruction was
straight forward but highly inefficient.
Basically, each pattern generates its own set of self contained checks
and actions.
E.g., TableGen generated:
// First pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDrr
// Second pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDri
// Third pattern
CheckNumOperand 3
CheckOpcode G_SUB
...
Build SUBrr
*** Problem ***
Because of that generation, a *lot* of check were redundant between each
pattern and were checked every single time until we reach the pattern
that matches.
E.g., Taking the previous table, let say we are matching a G_SUB, that
means we were going to check all the rules for G_ADD before looking at
the G_SUB rule. In particular we are going to do:
check 3 operands; PASS
check G_ADD; FAIL
; Next rule
check 3 operands; PASS (but we already knew that!)
check G_ADD; FAIL (well it is still not true)
; Next rule
check 3 operands; PASS (really!!)
check G_SUB; PASS (at last :P)
*** Proposed Solution ***
This patch introduces a concept of group of rules (GroupMatcher) that
share some predicates and only get checked once for the whole group.
This patch only creates groups with one nesting level. Conceptually
there is nothing preventing us for having deeper nest level. However,
the current implementation is not smart enough to share the recording
(aka capturing) of values. That limits its ability to do more sharing.
For the given example the current patch will generate:
// First group
CheckOpcode G_ADD
// First pattern
CheckNumOperand 3
...
Build ADDrr
// Second pattern
CheckNumOperand 3
...
Build ADDri
// Second group
CheckOpcode G_SUB
// Third pattern
CheckNumOperand 3
...
Build SUBrr
But if we allowed several nesting level, it could create a sub group
for the checknumoperand 3.
(We would need to call optimizeRules on the rules within a group.)
*** Result ***
With only one level of nesting, the instruction selection pass is up
to 4x faster. For instance, one instruction now takes 500 checks,
instead of 24k! With more nesting we could get in the tens I believe.
Differential Revision: https://reviews.llvm.org/D39034
rdar://problem/34670699
llvm-svn: 321017
2017-12-19 03:47:41 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_XOR,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] src1
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] Operand 2
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckConstantInt, /*MI*/0, /*Op*/2, -4
|
|
|
|
// NOOPT-NEXT: // (xor:{ *:[i32] } GPR32:{ *:[i32] }:$src1, -4:{ *:[i32] }) => (XORlike:{ *:[i32] } GPR32:{ *:[i32] }:$src1)
|
|
|
|
// NOOPT-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::XORlike,
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
|
|
|
// NOOPT-NEXT: GIR_AddImm, /*InsnID*/0, /*Imm*/-1,
|
|
|
|
// NOOPT-NEXT: GIR_AddRegister, /*InsnID*/0, MyTarget::R0,
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/1, // src1
|
|
|
|
// NOOPT-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 4,
|
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
[globalisel][tablegen] Add experimental support for OperandWithDefaultOps, PredicateOperand, and OptionalDefOperand
Summary:
As far as instruction selection is concerned, all three appear to be same thing.
Support for these operands is experimental since AArch64 doesn't make use
of them and the in-tree targets that do use them (AMDGPU for
OperandWithDefaultOps, AMDGPU/ARM/Hexagon/Lanai for PredicateOperand, and ARM
for OperandWithDefaultOps) are not using tablegen-erated GlobalISel yet.
Reviewers: rovka, aditya_nandakumar, t.p.northover, qcolombet, ab
Reviewed By: rovka
Subscribers: inglorion, aemerson, rengolin, mehdi_amini, dberris, kristof.beyls, igorb, tpr, llvm-commits
Differential Revision: https://reviews.llvm.org/D31135
llvm-svn: 300037
2017-04-12 16:23:08 +08:00
|
|
|
|
|
|
|
// The -4 is just to distinguish it from the other 'not' cases.
|
|
|
|
def XORlike : I<(outs GPR32:$dst), (ins m1Z:$src2, GPR32:$src1),
|
|
|
|
[(set GPR32:$dst, (xor GPR32:$src1, -4))]>;
|
|
|
|
|
2017-05-17 16:57:28 +08:00
|
|
|
//===- Test a simple pattern with multiple operands with defaults. --------===//
|
|
|
|
//
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/3,
|
[TableGen][GlobalISel] Optimize MatchTable for faster instruction selection
*** Context ***
Prior to this patchw, the table generated for matching instruction was
straight forward but highly inefficient.
Basically, each pattern generates its own set of self contained checks
and actions.
E.g., TableGen generated:
// First pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDrr
// Second pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDri
// Third pattern
CheckNumOperand 3
CheckOpcode G_SUB
...
Build SUBrr
*** Problem ***
Because of that generation, a *lot* of check were redundant between each
pattern and were checked every single time until we reach the pattern
that matches.
E.g., Taking the previous table, let say we are matching a G_SUB, that
means we were going to check all the rules for G_ADD before looking at
the G_SUB rule. In particular we are going to do:
check 3 operands; PASS
check G_ADD; FAIL
; Next rule
check 3 operands; PASS (but we already knew that!)
check G_ADD; FAIL (well it is still not true)
; Next rule
check 3 operands; PASS (really!!)
check G_SUB; PASS (at last :P)
*** Proposed Solution ***
This patch introduces a concept of group of rules (GroupMatcher) that
share some predicates and only get checked once for the whole group.
This patch only creates groups with one nesting level. Conceptually
there is nothing preventing us for having deeper nest level. However,
the current implementation is not smart enough to share the recording
(aka capturing) of values. That limits its ability to do more sharing.
For the given example the current patch will generate:
// First group
CheckOpcode G_ADD
// First pattern
CheckNumOperand 3
...
Build ADDrr
// Second pattern
CheckNumOperand 3
...
Build ADDri
// Second group
CheckOpcode G_SUB
// Third pattern
CheckNumOperand 3
...
Build SUBrr
But if we allowed several nesting level, it could create a sub group
for the checknumoperand 3.
(We would need to call optimizeRules on the rules within a group.)
*** Result ***
With only one level of nesting, the instruction selection pass is up
to 4x faster. For instance, one instruction now takes 500 checks,
instead of 24k! With more nesting we could get in the tens I believe.
Differential Revision: https://reviews.llvm.org/D39034
rdar://problem/34670699
llvm-svn: 321017
2017-12-19 03:47:41 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_XOR,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] src1
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] Operand 2
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckConstantInt, /*MI*/0, /*Op*/2, -5,
|
|
|
|
// NOOPT-NEXT: // (xor:{ *:[i32] } GPR32:{ *:[i32] }:$src1, -5:{ *:[i32] }) => (XORManyDefaults:{ *:[i32] } GPR32:{ *:[i32] }:$src1)
|
|
|
|
// NOOPT-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::XORManyDefaults,
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
|
|
|
// NOOPT-NEXT: GIR_AddImm, /*InsnID*/0, /*Imm*/-1,
|
|
|
|
// NOOPT-NEXT: GIR_AddRegister, /*InsnID*/0, MyTarget::R0,
|
|
|
|
// NOOPT-NEXT: GIR_AddRegister, /*InsnID*/0, MyTarget::R0,
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/1, // src1
|
|
|
|
// NOOPT-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 5,
|
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
2017-05-17 16:57:28 +08:00
|
|
|
|
|
|
|
// The -5 is just to distinguish it from the other cases.
|
|
|
|
def XORManyDefaults : I<(outs GPR32:$dst), (ins m1Z:$src3, Z:$src2, GPR32:$src1),
|
|
|
|
[(set GPR32:$dst, (xor GPR32:$src1, -5))]>;
|
|
|
|
|
[globalisel] Decouple src pattern operands from dst pattern operands.
Summary:
This isn't testable for AArch64 by itself so this patch also adds
support for constant immediates in the pattern and physical
register uses in the result.
The new IntOperandMatcher matches the constant in patterns such as
'(set $rd:GPR32, (G_XOR $rs:GPR32, -1))'. It's always safe to fold
immediates into an instruction so this is the first rule that will match
across multiple BB's.
The Renderer hierarchy is responsible for adding operands to the result
instruction. Renderers can copy operands (CopyRenderer) or add physical
registers (in particular %wzr and %xzr) to the result instruction
in any order (OperandMatchers now import the operand names from
SelectionDAG to allow renderers to access any operand). This allows us to
emit the result instruction for:
%1 = G_XOR %0, -1 --> %1 = ORNWrr %wzr, %0
%1 = G_XOR -1, %0 --> %1 = ORNWrr %wzr, %0
although the latter is untested since the matcher/importer has not been
taught about commutativity yet.
Added BuildMIAction which can build new instructions and mutate them where
possible. W.r.t the mutation aspect, MatchActions are now told the name of
an instruction they can recycle and BuildMIAction will emit mutation code
when the renderers are appropriate. They are appropriate when all operands
are rendered using CopyRenderer and the indices are the same as the matcher.
This currently assumes that all operands have at least one matcher.
Finally, this change also fixes a crash in
AArch64InstructionSelector::select() caused by an immediate operand
passing isImm() rather than isCImm(). This was uncovered by the other
changes and was detected by existing tests.
Depends on D29711
Reviewers: t.p.northover, ab, qcolombet, rovka, aditya_nandakumar, javed.absar
Reviewed By: rovka
Subscribers: aemerson, dberris, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D29712
llvm-svn: 296131
2017-02-24 23:43:30 +08:00
|
|
|
//===- Test a simple pattern with constant immediate operands. ------------===//
|
|
|
|
//
|
|
|
|
// This must precede the 3-register variants because constant immediates have
|
|
|
|
// priority over register banks.
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/3,
|
[TableGen][GlobalISel] Optimize MatchTable for faster instruction selection
*** Context ***
Prior to this patchw, the table generated for matching instruction was
straight forward but highly inefficient.
Basically, each pattern generates its own set of self contained checks
and actions.
E.g., TableGen generated:
// First pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDrr
// Second pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDri
// Third pattern
CheckNumOperand 3
CheckOpcode G_SUB
...
Build SUBrr
*** Problem ***
Because of that generation, a *lot* of check were redundant between each
pattern and were checked every single time until we reach the pattern
that matches.
E.g., Taking the previous table, let say we are matching a G_SUB, that
means we were going to check all the rules for G_ADD before looking at
the G_SUB rule. In particular we are going to do:
check 3 operands; PASS
check G_ADD; FAIL
; Next rule
check 3 operands; PASS (but we already knew that!)
check G_ADD; FAIL (well it is still not true)
; Next rule
check 3 operands; PASS (really!!)
check G_SUB; PASS (at last :P)
*** Proposed Solution ***
This patch introduces a concept of group of rules (GroupMatcher) that
share some predicates and only get checked once for the whole group.
This patch only creates groups with one nesting level. Conceptually
there is nothing preventing us for having deeper nest level. However,
the current implementation is not smart enough to share the recording
(aka capturing) of values. That limits its ability to do more sharing.
For the given example the current patch will generate:
// First group
CheckOpcode G_ADD
// First pattern
CheckNumOperand 3
...
Build ADDrr
// Second pattern
CheckNumOperand 3
...
Build ADDri
// Second group
CheckOpcode G_SUB
// Third pattern
CheckNumOperand 3
...
Build SUBrr
But if we allowed several nesting level, it could create a sub group
for the checknumoperand 3.
(We would need to call optimizeRules on the rules within a group.)
*** Result ***
With only one level of nesting, the instruction selection pass is up
to 4x faster. For instance, one instruction now takes 500 checks,
instead of 24k! With more nesting we could get in the tens I believe.
Differential Revision: https://reviews.llvm.org/D39034
rdar://problem/34670699
llvm-svn: 321017
2017-12-19 03:47:41 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_XOR,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] Wm
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] Operand 2
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckConstantInt, /*MI*/0, /*Op*/2, -1,
|
|
|
|
// NOOPT-NEXT: // (xor:{ *:[i32] } GPR32:{ *:[i32] }:$Wm, -1:{ *:[i32] }) => (ORN:{ *:[i32] } R0:{ *:[i32] }, GPR32:{ *:[i32] }:$Wm)
|
|
|
|
// NOOPT-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::ORN,
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
|
|
|
// NOOPT-NEXT: GIR_AddRegister, /*InsnID*/0, MyTarget::R0,
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/1, // Wm
|
|
|
|
// NOOPT-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 22,
|
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
[globalisel] Decouple src pattern operands from dst pattern operands.
Summary:
This isn't testable for AArch64 by itself so this patch also adds
support for constant immediates in the pattern and physical
register uses in the result.
The new IntOperandMatcher matches the constant in patterns such as
'(set $rd:GPR32, (G_XOR $rs:GPR32, -1))'. It's always safe to fold
immediates into an instruction so this is the first rule that will match
across multiple BB's.
The Renderer hierarchy is responsible for adding operands to the result
instruction. Renderers can copy operands (CopyRenderer) or add physical
registers (in particular %wzr and %xzr) to the result instruction
in any order (OperandMatchers now import the operand names from
SelectionDAG to allow renderers to access any operand). This allows us to
emit the result instruction for:
%1 = G_XOR %0, -1 --> %1 = ORNWrr %wzr, %0
%1 = G_XOR -1, %0 --> %1 = ORNWrr %wzr, %0
although the latter is untested since the matcher/importer has not been
taught about commutativity yet.
Added BuildMIAction which can build new instructions and mutate them where
possible. W.r.t the mutation aspect, MatchActions are now told the name of
an instruction they can recycle and BuildMIAction will emit mutation code
when the renderers are appropriate. They are appropriate when all operands
are rendered using CopyRenderer and the indices are the same as the matcher.
This currently assumes that all operands have at least one matcher.
Finally, this change also fixes a crash in
AArch64InstructionSelector::select() caused by an immediate operand
passing isImm() rather than isCImm(). This was uncovered by the other
changes and was detected by existing tests.
Depends on D29711
Reviewers: t.p.northover, ab, qcolombet, rovka, aditya_nandakumar, javed.absar
Reviewed By: rovka
Subscribers: aemerson, dberris, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D29712
llvm-svn: 296131
2017-02-24 23:43:30 +08:00
|
|
|
|
|
|
|
def ORN : I<(outs GPR32:$dst), (ins GPR32:$src1, GPR32:$src2), []>;
|
|
|
|
def : Pat<(not GPR32:$Wm), (ORN R0, GPR32:$Wm)>;
|
|
|
|
|
2018-02-17 06:37:15 +08:00
|
|
|
//===- Test a nested instruction match. -----------------------------------===//
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckFeatures, GIFBS_HasA,
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/3,
|
2018-02-17 06:37:15 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_MUL,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] Operand 1
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_RecordInsn, /*DefineMI*/1, /*MI*/0, /*OpIdx*/1, // MIs[1]
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/1, /*Expected*/3,
|
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/1, TargetOpcode::G_ADD,
|
|
|
|
// NOOPT-NEXT: // MIs[1] Operand 0
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/1, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: // MIs[1] src1
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/1, /*Op*/1, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/1, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[1] src2
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/1, /*Op*/2, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/1, /*Op*/2, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] src3
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/2, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: GIM_CheckIsSafeToFold, /*InsnID*/1,
|
|
|
|
// NOOPT-NEXT: // (mul:{ *:[i32] } (add:{ *:[i32] } GPR32:{ *:[i32] }:$src1, GPR32:{ *:[i32] }:$src2), GPR32:{ *:[i32] }:$src3) => (MULADD:{ *:[i32] } GPR32:{ *:[i32] }:$src1, GPR32:{ *:[i32] }:$src2, GPR32:{ *:[i32] }:$src3)
|
|
|
|
// NOOPT-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::MULADD,
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/1, /*OpIdx*/1, // src1
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/1, /*OpIdx*/2, // src2
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/2, // src3
|
|
|
|
// NOOPT-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 6,
|
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
2018-02-17 06:37:15 +08:00
|
|
|
|
|
|
|
// We also get a second rule by commutativity.
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckFeatures, GIFBS_HasA,
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/3,
|
2018-02-17 06:37:15 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_MUL,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] src3
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] Operand 2
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_RecordInsn, /*DefineMI*/1, /*MI*/0, /*OpIdx*/2,
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/1, /*Expected*/3,
|
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/1, TargetOpcode::G_ADD,
|
|
|
|
// NOOPT-NEXT: // MIs[1] Operand 0
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/1, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: // MIs[1] src1
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/1, /*Op*/1, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/1, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[1] src2
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/1, /*Op*/2, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/1, /*Op*/2, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: GIM_CheckIsSafeToFold, /*InsnID*/1,
|
|
|
|
// NOOPT-NEXT: // (mul:{ *:[i32] } GPR32:{ *:[i32] }:$src3, (add:{ *:[i32] } GPR32:{ *:[i32] }:$src1, GPR32:{ *:[i32] }:$src2)) => (MULADD:{ *:[i32] } GPR32:{ *:[i32] }:$src1, GPR32:{ *:[i32] }:$src2, GPR32:{ *:[i32] }:$src3)
|
|
|
|
// NOOPT-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::MULADD,
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/1, /*OpIdx*/1, // src1
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/1, /*OpIdx*/2, // src2
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/1, // src3
|
|
|
|
// NOOPT-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
Add support for pointer types in patterns
Summary:
This adds support for defining patterns for global isel using pointer
types, for example:
def : Pat<(load GPR32:$src),
(p1 (LOAD GPR32:$src))>;
DAGISelEmitter will ignore the pointer information and treat these
types as integers with the same bit-width as the pointer type.
Reviewers: dsanders, rtereshin, arsenm
Reviewed By: arsenm
Subscribers: Petar.Avramovic, wdng, rovka, kristof.beyls, jfb, volkan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57065
llvm-svn: 354510
2019-02-21 03:43:47 +08:00
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 26,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
2017-06-20 20:36:34 +08:00
|
|
|
|
2018-02-17 06:37:15 +08:00
|
|
|
def MULADD : I<(outs GPR32:$dst), (ins GPR32:$src1, GPR32:$src2, GPR32:$src3),
|
|
|
|
[(set GPR32:$dst,
|
|
|
|
(mul (add GPR32:$src1, GPR32:$src2), GPR32:$src3))]>,
|
|
|
|
Requires<[HasA]>;
|
2017-06-20 20:36:34 +08:00
|
|
|
|
2017-08-08 18:44:31 +08:00
|
|
|
//===- Test a simple pattern with just a specific leaf immediate. ---------===//
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/2,
|
[TableGen][GlobalISel] Optimize MatchTable for faster instruction selection
*** Context ***
Prior to this patchw, the table generated for matching instruction was
straight forward but highly inefficient.
Basically, each pattern generates its own set of self contained checks
and actions.
E.g., TableGen generated:
// First pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDrr
// Second pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDri
// Third pattern
CheckNumOperand 3
CheckOpcode G_SUB
...
Build SUBrr
*** Problem ***
Because of that generation, a *lot* of check were redundant between each
pattern and were checked every single time until we reach the pattern
that matches.
E.g., Taking the previous table, let say we are matching a G_SUB, that
means we were going to check all the rules for G_ADD before looking at
the G_SUB rule. In particular we are going to do:
check 3 operands; PASS
check G_ADD; FAIL
; Next rule
check 3 operands; PASS (but we already knew that!)
check G_ADD; FAIL (well it is still not true)
; Next rule
check 3 operands; PASS (really!!)
check G_SUB; PASS (at last :P)
*** Proposed Solution ***
This patch introduces a concept of group of rules (GroupMatcher) that
share some predicates and only get checked once for the whole group.
This patch only creates groups with one nesting level. Conceptually
there is nothing preventing us for having deeper nest level. However,
the current implementation is not smart enough to share the recording
(aka capturing) of values. That limits its ability to do more sharing.
For the given example the current patch will generate:
// First group
CheckOpcode G_ADD
// First pattern
CheckNumOperand 3
...
Build ADDrr
// Second pattern
CheckNumOperand 3
...
Build ADDri
// Second group
CheckOpcode G_SUB
// Third pattern
CheckNumOperand 3
...
Build SUBrr
But if we allowed several nesting level, it could create a sub group
for the checknumoperand 3.
(We would need to call optimizeRules on the rules within a group.)
*** Result ***
With only one level of nesting, the instruction selection pass is up
to 4x faster. For instance, one instruction now takes 500 checks,
instead of 24k! With more nesting we could get in the tens I believe.
Differential Revision: https://reviews.llvm.org/D39034
rdar://problem/34670699
llvm-svn: 321017
2017-12-19 03:47:41 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_CONSTANT,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] Operand 1
|
|
|
|
// NOOPT-NEXT: GIM_CheckLiteralInt, /*MI*/0, /*Op*/1, 1,
|
|
|
|
// NOOPT-NEXT: // 1:{ *:[i32] } => (MOV1:{ *:[i32] })
|
|
|
|
// NOOPT-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::MOV1,
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
|
|
|
// NOOPT-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 7,
|
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
2017-05-24 03:33:16 +08:00
|
|
|
|
|
|
|
def MOV1 : I<(outs GPR32:$dst), (ins), [(set GPR32:$dst, 1)]>;
|
|
|
|
|
2017-08-24 17:11:20 +08:00
|
|
|
//===- Test a simple pattern with a leaf immediate and a predicate. -------===//
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/2,
|
[TableGen][GlobalISel] Optimize MatchTable for faster instruction selection
*** Context ***
Prior to this patchw, the table generated for matching instruction was
straight forward but highly inefficient.
Basically, each pattern generates its own set of self contained checks
and actions.
E.g., TableGen generated:
// First pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDrr
// Second pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDri
// Third pattern
CheckNumOperand 3
CheckOpcode G_SUB
...
Build SUBrr
*** Problem ***
Because of that generation, a *lot* of check were redundant between each
pattern and were checked every single time until we reach the pattern
that matches.
E.g., Taking the previous table, let say we are matching a G_SUB, that
means we were going to check all the rules for G_ADD before looking at
the G_SUB rule. In particular we are going to do:
check 3 operands; PASS
check G_ADD; FAIL
; Next rule
check 3 operands; PASS (but we already knew that!)
check G_ADD; FAIL (well it is still not true)
; Next rule
check 3 operands; PASS (really!!)
check G_SUB; PASS (at last :P)
*** Proposed Solution ***
This patch introduces a concept of group of rules (GroupMatcher) that
share some predicates and only get checked once for the whole group.
This patch only creates groups with one nesting level. Conceptually
there is nothing preventing us for having deeper nest level. However,
the current implementation is not smart enough to share the recording
(aka capturing) of values. That limits its ability to do more sharing.
For the given example the current patch will generate:
// First group
CheckOpcode G_ADD
// First pattern
CheckNumOperand 3
...
Build ADDrr
// Second pattern
CheckNumOperand 3
...
Build ADDri
// Second group
CheckOpcode G_SUB
// Third pattern
CheckNumOperand 3
...
Build SUBrr
But if we allowed several nesting level, it could create a sub group
for the checknumoperand 3.
(We would need to call optimizeRules on the rules within a group.)
*** Result ***
With only one level of nesting, the instruction selection pass is up
to 4x faster. For instance, one instruction now takes 500 checks,
instead of 24k! With more nesting we could get in the tens I believe.
Differential Revision: https://reviews.llvm.org/D39034
rdar://problem/34670699
llvm-svn: 321017
2017-12-19 03:47:41 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_CONSTANT,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckI64ImmPredicate, /*MI*/0, /*Predicate*/GIPFP_I64_Predicate_simm8,
|
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] Operand 1
|
|
|
|
// NOOPT-NEXT: // No operand predicates
|
|
|
|
// NOOPT-NEXT: // (imm:{ *:[i32] })<<P:Predicate_simm8>>:$imm => (MOVimm8:{ *:[i32] } (imm:{ *:[i32] }):$imm)
|
|
|
|
// NOOPT-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::MOVimm8,
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
|
|
|
// NOOPT-NEXT: GIR_CopyConstantAsSImm, /*NewInsnID*/0, /*OldInsnID*/0, // imm
|
|
|
|
// NOOPT-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 8,
|
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
2017-08-24 17:11:20 +08:00
|
|
|
|
|
|
|
def simm8 : ImmLeaf<i32, [{ return isInt<8>(Imm); }]>;
|
|
|
|
def MOVimm8 : I<(outs GPR32:$dst), (ins i32imm:$imm), [(set GPR32:$dst, simm8:$imm)]>;
|
|
|
|
|
[globalisel][tablegen] Add support for fpimm and import of APInt/APFloat based ImmLeaf.
Summary:
There's only a tablegen testcase for IntImmLeaf and not a CodeGen one
because the relevant rules are rejected for other reasons at the moment.
On AArch64, it's because there's an SDNodeXForm attached to the operand.
On X86, it's because the rule either emits multiple instructions or has
another predicate using PatFrag which cannot easily be supported at the
same time.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36569
llvm-svn: 315761
2017-10-14 05:28:03 +08:00
|
|
|
//===- Same again but use an IntImmLeaf. ----------------------------------===//
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/2,
|
[TableGen][GlobalISel] Optimize MatchTable for faster instruction selection
*** Context ***
Prior to this patchw, the table generated for matching instruction was
straight forward but highly inefficient.
Basically, each pattern generates its own set of self contained checks
and actions.
E.g., TableGen generated:
// First pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDrr
// Second pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDri
// Third pattern
CheckNumOperand 3
CheckOpcode G_SUB
...
Build SUBrr
*** Problem ***
Because of that generation, a *lot* of check were redundant between each
pattern and were checked every single time until we reach the pattern
that matches.
E.g., Taking the previous table, let say we are matching a G_SUB, that
means we were going to check all the rules for G_ADD before looking at
the G_SUB rule. In particular we are going to do:
check 3 operands; PASS
check G_ADD; FAIL
; Next rule
check 3 operands; PASS (but we already knew that!)
check G_ADD; FAIL (well it is still not true)
; Next rule
check 3 operands; PASS (really!!)
check G_SUB; PASS (at last :P)
*** Proposed Solution ***
This patch introduces a concept of group of rules (GroupMatcher) that
share some predicates and only get checked once for the whole group.
This patch only creates groups with one nesting level. Conceptually
there is nothing preventing us for having deeper nest level. However,
the current implementation is not smart enough to share the recording
(aka capturing) of values. That limits its ability to do more sharing.
For the given example the current patch will generate:
// First group
CheckOpcode G_ADD
// First pattern
CheckNumOperand 3
...
Build ADDrr
// Second pattern
CheckNumOperand 3
...
Build ADDri
// Second group
CheckOpcode G_SUB
// Third pattern
CheckNumOperand 3
...
Build SUBrr
But if we allowed several nesting level, it could create a sub group
for the checknumoperand 3.
(We would need to call optimizeRules on the rules within a group.)
*** Result ***
With only one level of nesting, the instruction selection pass is up
to 4x faster. For instance, one instruction now takes 500 checks,
instead of 24k! With more nesting we could get in the tens I believe.
Differential Revision: https://reviews.llvm.org/D39034
rdar://problem/34670699
llvm-svn: 321017
2017-12-19 03:47:41 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_CONSTANT,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckAPIntImmPredicate, /*MI*/0, /*Predicate*/GIPFP_APInt_Predicate_simm9,
|
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] Operand 1
|
|
|
|
// NOOPT-NEXT: // No operand predicates
|
|
|
|
// NOOPT-NEXT: // (imm:{ *:[i32] })<<P:Predicate_simm9>>:$imm => (MOVimm9:{ *:[i32] } (imm:{ *:[i32] }):$imm)
|
|
|
|
// NOOPT-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::MOVimm9,
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
|
|
|
// NOOPT-NEXT: GIR_CopyConstantAsSImm, /*NewInsnID*/0, /*OldInsnID*/0, // imm
|
|
|
|
// NOOPT-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 9,
|
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
[globalisel][tablegen] Add support for fpimm and import of APInt/APFloat based ImmLeaf.
Summary:
There's only a tablegen testcase for IntImmLeaf and not a CodeGen one
because the relevant rules are rejected for other reasons at the moment.
On AArch64, it's because there's an SDNodeXForm attached to the operand.
On X86, it's because the rule either emits multiple instructions or has
another predicate using PatFrag which cannot easily be supported at the
same time.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36569
llvm-svn: 315761
2017-10-14 05:28:03 +08:00
|
|
|
|
|
|
|
def simm9 : IntImmLeaf<i32, [{ return isInt<9>(Imm->getSExtValue()); }]>;
|
|
|
|
def MOVimm9 : I<(outs GPR32:$dst), (ins i32imm:$imm), [(set GPR32:$dst, simm9:$imm)]>;
|
|
|
|
|
2018-01-17 02:44:05 +08:00
|
|
|
//===- Test a pattern with a custom renderer. -----------------------------===//
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/2,
|
2018-01-17 02:44:05 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_CONSTANT,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckI64ImmPredicate, /*MI*/0, /*Predicate*/GIPFP_I64_Predicate_cimm8,
|
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] Operand 1
|
|
|
|
// NOOPT-NEXT: // No operand predicates
|
|
|
|
// NOOPT-NEXT: // (imm:{ *:[i32] })<<P:Predicate_cimm8>><<X:cimm8_xform>>:$imm => (MOVcimm8:{ *:[i32] } (cimm8_xform:{ *:[i32] } (imm:{ *:[i32] }):$imm))
|
|
|
|
// NOOPT-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::MOVcimm8,
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
2020-08-27 23:15:33 +08:00
|
|
|
// NOOPT-NEXT: GIR_CustomRenderer, /*InsnID*/0, /*OldInsnID*/0, /*Renderer*/GICR_renderImm, // imm
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 10,
|
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
|
|
|
|
2018-01-17 02:44:05 +08:00
|
|
|
def MOVcimm8 : I<(outs GPR32:$dst), (ins i32imm:$imm), [(set GPR32:$dst, cimm8:$imm)]>;
|
2017-08-08 18:44:31 +08:00
|
|
|
|
[globalisel][tablegen] Add support for fpimm and import of APInt/APFloat based ImmLeaf.
Summary:
There's only a tablegen testcase for IntImmLeaf and not a CodeGen one
because the relevant rules are rejected for other reasons at the moment.
On AArch64, it's because there's an SDNodeXForm attached to the operand.
On X86, it's because the rule either emits multiple instructions or has
another predicate using PatFrag which cannot easily be supported at the
same time.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36569
llvm-svn: 315761
2017-10-14 05:28:03 +08:00
|
|
|
//===- Test a simple pattern with a FP immediate and a predicate. ---------===//
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/2,
|
[TableGen][GlobalISel] Optimize MatchTable for faster instruction selection
*** Context ***
Prior to this patchw, the table generated for matching instruction was
straight forward but highly inefficient.
Basically, each pattern generates its own set of self contained checks
and actions.
E.g., TableGen generated:
// First pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDrr
// Second pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDri
// Third pattern
CheckNumOperand 3
CheckOpcode G_SUB
...
Build SUBrr
*** Problem ***
Because of that generation, a *lot* of check were redundant between each
pattern and were checked every single time until we reach the pattern
that matches.
E.g., Taking the previous table, let say we are matching a G_SUB, that
means we were going to check all the rules for G_ADD before looking at
the G_SUB rule. In particular we are going to do:
check 3 operands; PASS
check G_ADD; FAIL
; Next rule
check 3 operands; PASS (but we already knew that!)
check G_ADD; FAIL (well it is still not true)
; Next rule
check 3 operands; PASS (really!!)
check G_SUB; PASS (at last :P)
*** Proposed Solution ***
This patch introduces a concept of group of rules (GroupMatcher) that
share some predicates and only get checked once for the whole group.
This patch only creates groups with one nesting level. Conceptually
there is nothing preventing us for having deeper nest level. However,
the current implementation is not smart enough to share the recording
(aka capturing) of values. That limits its ability to do more sharing.
For the given example the current patch will generate:
// First group
CheckOpcode G_ADD
// First pattern
CheckNumOperand 3
...
Build ADDrr
// Second pattern
CheckNumOperand 3
...
Build ADDri
// Second group
CheckOpcode G_SUB
// Third pattern
CheckNumOperand 3
...
Build SUBrr
But if we allowed several nesting level, it could create a sub group
for the checknumoperand 3.
(We would need to call optimizeRules on the rules within a group.)
*** Result ***
With only one level of nesting, the instruction selection pass is up
to 4x faster. For instance, one instruction now takes 500 checks,
instead of 24k! With more nesting we could get in the tens I believe.
Differential Revision: https://reviews.llvm.org/D39034
rdar://problem/34670699
llvm-svn: 321017
2017-12-19 03:47:41 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_FCONSTANT,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckAPFloatImmPredicate, /*MI*/0, /*Predicate*/GIPFP_APFloat_Predicate_fpimmz,
|
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::FPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] Operand 1
|
|
|
|
// NOOPT-NEXT: // No operand predicates
|
|
|
|
// NOOPT-NEXT: // (fpimm:{ *:[f32] })<<P:Predicate_fpimmz>>:$imm => (MOVfpimmz:{ *:[f32] } (fpimm:{ *:[f32] }):$imm)
|
|
|
|
// NOOPT-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::MOVfpimmz,
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
|
|
|
// NOOPT-NEXT: GIR_CopyFConstantAsFPImm, /*NewInsnID*/0, /*OldInsnID*/0, // imm
|
|
|
|
// NOOPT-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 17,
|
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
[globalisel][tablegen] Add support for fpimm and import of APInt/APFloat based ImmLeaf.
Summary:
There's only a tablegen testcase for IntImmLeaf and not a CodeGen one
because the relevant rules are rejected for other reasons at the moment.
On AArch64, it's because there's an SDNodeXForm attached to the operand.
On X86, it's because the rule either emits multiple instructions or has
another predicate using PatFrag which cannot easily be supported at the
same time.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36569
llvm-svn: 315761
2017-10-14 05:28:03 +08:00
|
|
|
|
[globalisel][tablegen] Implement unindexed load, non-extending load, and MemVT checks
Summary:
This includes some context-sensitivity in the MVT to LLT conversion so that
pointer types are tested correctly.
FIXME: I'm not happy with the way this is done since everything is a
special-case. I've yet to find a reasonable way to implement it.
select-load.mir fails because <1 x s64> loads in tablegen get priority over s64
loads. This is fixed in the next patch and as such they should be committed
together, I've posted them separately to help with the review.
Depends on D37456
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37457
llvm-svn: 315884
2017-10-16 08:56:30 +08:00
|
|
|
//===- Test a simple pattern with inferred pointer operands. ---------------===//
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/2,
|
[TableGen][GlobalISel] Optimize MatchTable for faster instruction selection
*** Context ***
Prior to this patchw, the table generated for matching instruction was
straight forward but highly inefficient.
Basically, each pattern generates its own set of self contained checks
and actions.
E.g., TableGen generated:
// First pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDrr
// Second pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDri
// Third pattern
CheckNumOperand 3
CheckOpcode G_SUB
...
Build SUBrr
*** Problem ***
Because of that generation, a *lot* of check were redundant between each
pattern and were checked every single time until we reach the pattern
that matches.
E.g., Taking the previous table, let say we are matching a G_SUB, that
means we were going to check all the rules for G_ADD before looking at
the G_SUB rule. In particular we are going to do:
check 3 operands; PASS
check G_ADD; FAIL
; Next rule
check 3 operands; PASS (but we already knew that!)
check G_ADD; FAIL (well it is still not true)
; Next rule
check 3 operands; PASS (really!!)
check G_SUB; PASS (at last :P)
*** Proposed Solution ***
This patch introduces a concept of group of rules (GroupMatcher) that
share some predicates and only get checked once for the whole group.
This patch only creates groups with one nesting level. Conceptually
there is nothing preventing us for having deeper nest level. However,
the current implementation is not smart enough to share the recording
(aka capturing) of values. That limits its ability to do more sharing.
For the given example the current patch will generate:
// First group
CheckOpcode G_ADD
// First pattern
CheckNumOperand 3
...
Build ADDrr
// Second pattern
CheckNumOperand 3
...
Build ADDri
// Second group
CheckOpcode G_SUB
// Third pattern
CheckNumOperand 3
...
Build SUBrr
But if we allowed several nesting level, it could create a sub group
for the checknumoperand 3.
(We would need to call optimizeRules on the rules within a group.)
*** Result ***
With only one level of nesting, the instruction selection pass is up
to 4x faster. For instance, one instruction now takes 500 checks,
instead of 24k! With more nesting we could get in the tens I believe.
Differential Revision: https://reviews.llvm.org/D39034
rdar://problem/34670699
llvm-svn: 321017
2017-12-19 03:47:41 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_LOAD,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckMemorySizeEqualToLLT, /*MI*/0, /*MMO*/0, /*OpIdx*/0,
|
|
|
|
// NOOPT-NEXT: GIM_CheckAtomicOrdering, /*MI*/0, /*Order*/(int64_t)AtomicOrdering::NotAtomic,
|
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] src1
|
|
|
|
// NOOPT-NEXT: GIM_CheckPointerToAny, /*MI*/0, /*Op*/1, /*SizeInBits*/32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // (ld:{ *:[i32] } GPR32:{ *:[i32] }:$src1)<<P:Predicate_unindexedload>><<P:Predicate_load>> => (LOAD:{ *:[i32] } GPR32:{ *:[i32] }:$src1)
|
|
|
|
// NOOPT-NEXT: GIR_MutateOpcode, /*InsnID*/0, /*RecycleInsnID*/0, /*Opcode*/MyTarget::LOAD,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 11,
|
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
[globalisel][tablegen] Implement unindexed load, non-extending load, and MemVT checks
Summary:
This includes some context-sensitivity in the MVT to LLT conversion so that
pointer types are tested correctly.
FIXME: I'm not happy with the way this is done since everything is a
special-case. I've yet to find a reasonable way to implement it.
select-load.mir fails because <1 x s64> loads in tablegen get priority over s64
loads. This is fixed in the next patch and as such they should be committed
together, I've posted them separately to help with the review.
Depends on D37456
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37457
llvm-svn: 315884
2017-10-16 08:56:30 +08:00
|
|
|
|
|
|
|
def LOAD : I<(outs GPR32:$dst), (ins GPR32:$src1),
|
|
|
|
[(set GPR32:$dst, (load GPR32:$src1))]>;
|
2017-11-11 11:23:44 +08:00
|
|
|
|
Add support for pointer types in patterns
Summary:
This adds support for defining patterns for global isel using pointer
types, for example:
def : Pat<(load GPR32:$src),
(p1 (LOAD GPR32:$src))>;
DAGISelEmitter will ignore the pointer information and treat these
types as integers with the same bit-width as the pointer type.
Reviewers: dsanders, rtereshin, arsenm
Reviewed By: arsenm
Subscribers: Petar.Avramovic, wdng, rovka, kristof.beyls, jfb, volkan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57065
llvm-svn: 354510
2019-02-21 03:43:47 +08:00
|
|
|
//===- Test a simple pattern with explicit pointer operands. ---------------===//
|
|
|
|
|
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/2,
|
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_LOAD,
|
|
|
|
// NOOPT-NEXT: GIM_CheckMemorySizeEqualToLLT, /*MI*/0, /*MMO*/0, /*OpIdx*/0,
|
|
|
|
// NOOPT-NEXT: GIM_CheckAtomicOrdering, /*MI*/0, /*Order*/(int64_t)AtomicOrdering::NotAtomic,
|
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_p0s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] src
|
|
|
|
// NOOPT-NEXT: GIM_CheckPointerToAny, /*MI*/0, /*Op*/1, /*SizeInBits*/32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // (ld:{ *:[i32] } GPR32:{ *:[i32] }:$src)<<P:Predicate_unindexedload>><<P:Predicate_load>> => (LOAD:{ *:[i32] } GPR32:{ *:[i32] }:$src)
|
|
|
|
// NOOPT-NEXT: GIR_MutateOpcode, /*InsnID*/0, /*RecycleInsnID*/0, /*Opcode*/MyTarget::LOAD,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 23,
|
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
|
|
|
|
|
|
|
def : Pat<(load GPR32:$src),
|
|
|
|
(p0 (LOAD GPR32:$src))>;
|
|
|
|
|
[globalisel] Update GlobalISel emitter to match new representation of extending loads
Summary:
Previously, a extending load was represented at (G_*EXT (G_LOAD x)).
This had a few drawbacks:
* G_LOAD had to be legal for all sizes you could extend from, even if
registers didn't naturally hold those sizes.
* All sizes you could extend from had to be allocatable just in case the
extend went missing (e.g. by optimization).
* At minimum, G_*EXT and G_TRUNC had to be legal for these sizes. As we
improve optimization of extends and truncates, this legality requirement
would spread without considerable care w.r.t when certain combines were
permitted.
* The SelectionDAG importer required some ugly and fragile pattern
rewriting to translate patterns into this style.
This patch changes the representation to:
* (G_[SZ]EXTLOAD x)
* (G_LOAD x) any-extends when MMO.getSize() * 8 < ResultTy.getSizeInBits()
which resolves these issues by allowing targets to work entirely in their
native register sizes, and by having a more direct translation from
SelectionDAG patterns.
Each extending load can be lowered by the legalizer into separate extends
and loads, however a target that supports s1 will need the any-extending
load to extend to at least s8 since LLVM does not represent memory accesses
smaller than 8 bit. The legalizer can widenScalar G_LOAD into an
any-extending load but sign/zero-extending loads need help from something
else like a combiner pass. A follow-up patch that adds combiner helpers for
for this will follow.
The new representation requires that the MMO correctly reflect the memory
access so this has been corrected in a couple tests. I've also moved the
extending loads to their own tests since they are (mostly) separate opcodes
now. Additionally, the re-write appears to have invalidated two tests from
select-with-no-legality-check.mir since the matcher table no longer contains
loads that result in s1's and they aren't legal in AArch64 anymore.
Depends on D45540
Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar
Reviewed By: rtereshin
Subscribers: javed.absar, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D45541
llvm-svn: 331601
2018-05-06 04:53:24 +08:00
|
|
|
//===- Test a simple pattern with a sextload -------------------------------===//
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/2,
|
[globalisel] Update GlobalISel emitter to match new representation of extending loads
Summary:
Previously, a extending load was represented at (G_*EXT (G_LOAD x)).
This had a few drawbacks:
* G_LOAD had to be legal for all sizes you could extend from, even if
registers didn't naturally hold those sizes.
* All sizes you could extend from had to be allocatable just in case the
extend went missing (e.g. by optimization).
* At minimum, G_*EXT and G_TRUNC had to be legal for these sizes. As we
improve optimization of extends and truncates, this legality requirement
would spread without considerable care w.r.t when certain combines were
permitted.
* The SelectionDAG importer required some ugly and fragile pattern
rewriting to translate patterns into this style.
This patch changes the representation to:
* (G_[SZ]EXTLOAD x)
* (G_LOAD x) any-extends when MMO.getSize() * 8 < ResultTy.getSizeInBits()
which resolves these issues by allowing targets to work entirely in their
native register sizes, and by having a more direct translation from
SelectionDAG patterns.
Each extending load can be lowered by the legalizer into separate extends
and loads, however a target that supports s1 will need the any-extending
load to extend to at least s8 since LLVM does not represent memory accesses
smaller than 8 bit. The legalizer can widenScalar G_LOAD into an
any-extending load but sign/zero-extending loads need help from something
else like a combiner pass. A follow-up patch that adds combiner helpers for
for this will follow.
The new representation requires that the MMO correctly reflect the memory
access so this has been corrected in a couple tests. I've also moved the
extending loads to their own tests since they are (mostly) separate opcodes
now. Additionally, the re-write appears to have invalidated two tests from
select-with-no-legality-check.mir since the matcher table no longer contains
loads that result in s1's and they aren't legal in AArch64 anymore.
Depends on D45540
Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar
Reviewed By: rtereshin
Subscribers: javed.absar, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D45541
llvm-svn: 331601
2018-05-06 04:53:24 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_SEXTLOAD,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckMemorySizeEqualTo, /*MI*/0, /*MMO*/0, /*Size*/2,
|
|
|
|
// NOOPT-NEXT: GIM_CheckAtomicOrdering, /*MI*/0, /*Order*/(int64_t)AtomicOrdering::NotAtomic,
|
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] src1
|
|
|
|
// NOOPT-NEXT: GIM_CheckPointerToAny, /*MI*/0, /*Op*/1, /*SizeInBits*/32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // (ld:{ *:[i32] } GPR32:{ *:[i32] }:$src1)<<P:Predicate_unindexedload>><<P:Predicate_sextload>><<P:Predicate_sextloadi16>> => (SEXTLOAD:{ *:[i32] } GPR32:{ *:[i32] }:$src1)
|
|
|
|
// NOOPT-NEXT: GIR_MutateOpcode, /*InsnID*/0, /*RecycleInsnID*/0, /*Opcode*/MyTarget::SEXTLOAD,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 12,
|
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
[globalisel] Update GlobalISel emitter to match new representation of extending loads
Summary:
Previously, a extending load was represented at (G_*EXT (G_LOAD x)).
This had a few drawbacks:
* G_LOAD had to be legal for all sizes you could extend from, even if
registers didn't naturally hold those sizes.
* All sizes you could extend from had to be allocatable just in case the
extend went missing (e.g. by optimization).
* At minimum, G_*EXT and G_TRUNC had to be legal for these sizes. As we
improve optimization of extends and truncates, this legality requirement
would spread without considerable care w.r.t when certain combines were
permitted.
* The SelectionDAG importer required some ugly and fragile pattern
rewriting to translate patterns into this style.
This patch changes the representation to:
* (G_[SZ]EXTLOAD x)
* (G_LOAD x) any-extends when MMO.getSize() * 8 < ResultTy.getSizeInBits()
which resolves these issues by allowing targets to work entirely in their
native register sizes, and by having a more direct translation from
SelectionDAG patterns.
Each extending load can be lowered by the legalizer into separate extends
and loads, however a target that supports s1 will need the any-extending
load to extend to at least s8 since LLVM does not represent memory accesses
smaller than 8 bit. The legalizer can widenScalar G_LOAD into an
any-extending load but sign/zero-extending loads need help from something
else like a combiner pass. A follow-up patch that adds combiner helpers for
for this will follow.
The new representation requires that the MMO correctly reflect the memory
access so this has been corrected in a couple tests. I've also moved the
extending loads to their own tests since they are (mostly) separate opcodes
now. Additionally, the re-write appears to have invalidated two tests from
select-with-no-legality-check.mir since the matcher table no longer contains
loads that result in s1's and they aren't legal in AArch64 anymore.
Depends on D45540
Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar
Reviewed By: rtereshin
Subscribers: javed.absar, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D45541
llvm-svn: 331601
2018-05-06 04:53:24 +08:00
|
|
|
|
|
|
|
def SEXTLOAD : I<(outs GPR32:$dst), (ins GPR32:$src1),
|
|
|
|
[(set GPR32:$dst, (sextloadi16 GPR32:$src1))]>;
|
|
|
|
|
2018-02-17 06:37:15 +08:00
|
|
|
//===- Test a simple pattern with regclass operands. ----------------------===//
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/3,
|
2018-02-17 06:37:15 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_ADD,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] src1
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID
|
|
|
|
// NOOPT-NEXT: // MIs[0] src2
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/2, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // (add:{ *:[i32] } GPR32:{ *:[i32] }:$src1, GPR32:{ *:[i32] }:$src2) => (ADD:{ *:[i32] } GPR32:{ *:[i32] }:$src1, GPR32:{ *:[i32] }:$src2)
|
|
|
|
// NOOPT-NEXT: GIR_MutateOpcode, /*InsnID*/0, /*RecycleInsnID*/0, /*Opcode*/MyTarget::ADD,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 13,
|
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
2018-02-17 06:37:15 +08:00
|
|
|
|
|
|
|
def ADD : I<(outs GPR32:$dst), (ins GPR32:$src1, GPR32:$src2),
|
|
|
|
[(set GPR32:$dst, (add GPR32:$src1, GPR32:$src2))]>;
|
|
|
|
|
|
|
|
//===- Test a pattern with a tied operand in the matcher ------------------===//
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/3,
|
2018-02-17 06:37:15 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_ADD,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] src{{$}}
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] src{{$}}
|
|
|
|
// NOOPT-NEXT: GIM_CheckIsSameOperand, /*MI*/0, /*OpIdx*/2, /*OtherMI*/0, /*OtherOpIdx*/1,
|
|
|
|
// NOOPT-NEXT: // (add:{ *:[i32] } GPR32:{ *:[i32] }:$src, GPR32:{ *:[i32] }:$src) => (DOUBLE:{ *:[i32] } GPR32:{ *:[i32] }:$src)
|
|
|
|
// NOOPT-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::DOUBLE,
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/1, // src
|
|
|
|
// NOOPT-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 14,
|
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
2018-02-17 06:37:15 +08:00
|
|
|
|
|
|
|
def DOUBLE : I<(outs GPR32:$dst), (ins GPR32:$src), [(set GPR32:$dst, (add GPR32:$src, GPR32:$src))]>;
|
|
|
|
|
|
|
|
//===- Test a simple pattern with ValueType operands. ----------------------===//
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/3,
|
2018-02-17 06:37:15 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_ADD,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] src1
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: // MIs[0] src2
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: // (add:{ *:[i32] } i32:{ *:[i32] }:$src1, i32:{ *:[i32] }:$src2) => (ADD:{ *:[i32] } i32:{ *:[i32] }:$src1, i32:{ *:[i32] }:$src2)
|
|
|
|
// NOOPT-NEXT: GIR_MutateOpcode, /*InsnID*/0, /*RecycleInsnID*/0, /*Opcode*/MyTarget::ADD,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
Add support for pointer types in patterns
Summary:
This adds support for defining patterns for global isel using pointer
types, for example:
def : Pat<(load GPR32:$src),
(p1 (LOAD GPR32:$src))>;
DAGISelEmitter will ignore the pointer information and treat these
types as integers with the same bit-width as the pointer type.
Reviewers: dsanders, rtereshin, arsenm
Reviewed By: arsenm
Subscribers: Petar.Avramovic, wdng, rovka, kristof.beyls, jfb, volkan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57065
llvm-svn: 354510
2019-02-21 03:43:47 +08:00
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 24,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
2018-02-17 06:37:15 +08:00
|
|
|
|
|
|
|
def : Pat<(add i32:$src1, i32:$src2),
|
|
|
|
(ADD i32:$src1, i32:$src2)>;
|
|
|
|
|
|
|
|
//===- Test another simple pattern with regclass operands. ----------------===//
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckFeatures, GIFBS_HasA_HasB_HasC,
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/3,
|
2018-02-17 06:37:15 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_MUL,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] src1
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] src2
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/2, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // (mul:{ *:[i32] } GPR32:{ *:[i32] }:$src1, GPR32:{ *:[i32] }:$src2) => (MUL:{ *:[i32] } GPR32:{ *:[i32] }:$src2, GPR32:{ *:[i32] }:$src1)
|
|
|
|
// NOOPT-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::MUL,
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/2, // src2
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/1, // src1
|
|
|
|
// NOOPT-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 15,
|
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
2018-02-17 06:37:15 +08:00
|
|
|
|
|
|
|
def MUL : I<(outs GPR32:$dst), (ins GPR32:$src2, GPR32:$src1),
|
|
|
|
[(set GPR32:$dst, (mul GPR32:$src1, GPR32:$src2))]>,
|
|
|
|
Requires<[HasA, HasB, HasC]>;
|
|
|
|
|
|
|
|
//===- Test a COPY_TO_REGCLASS --------------------------------------------===//
|
|
|
|
//
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/2,
|
2018-02-17 06:37:15 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_BITCAST,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] src1
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/MyTarget::FPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // (bitconvert:{ *:[i32] } FPR32:{ *:[f32] }:$src1) => (COPY_TO_REGCLASS:{ *:[i32] } FPR32:{ *:[f32] }:$src1, GPR32:{ *:[i32] })
|
|
|
|
// NOOPT-NEXT: GIR_MutateOpcode, /*InsnID*/0, /*RecycleInsnID*/0, /*Opcode*/TargetOpcode::COPY,
|
2020-07-14 01:14:29 +08:00
|
|
|
// NOOPT-NEXT: GIR_ConstrainOperandRC, /*InsnID*/0, /*Op*/0, MyTarget::GPR32RegClassID,
|
Add support for pointer types in patterns
Summary:
This adds support for defining patterns for global isel using pointer
types, for example:
def : Pat<(load GPR32:$src),
(p1 (LOAD GPR32:$src))>;
DAGISelEmitter will ignore the pointer information and treat these
types as integers with the same bit-width as the pointer type.
Reviewers: dsanders, rtereshin, arsenm
Reviewed By: arsenm
Subscribers: Petar.Avramovic, wdng, rovka, kristof.beyls, jfb, volkan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57065
llvm-svn: 354510
2019-02-21 03:43:47 +08:00
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 25,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
2018-02-17 06:37:15 +08:00
|
|
|
|
|
|
|
def : Pat<(i32 (bitconvert FPR32:$src1)),
|
|
|
|
(COPY_TO_REGCLASS FPR32:$src1, GPR32)>;
|
|
|
|
|
|
|
|
//===- Test a simple pattern with just a leaf immediate. ------------------===//
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/2,
|
2018-02-17 06:37:15 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_CONSTANT,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: // MIs[0] dst
|
|
|
|
// NOOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
|
|
|
|
// NOOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/MyTarget::GPR32RegClassID,
|
|
|
|
// NOOPT-NEXT: // MIs[0] Operand 1
|
|
|
|
// NOOPT-NEXT: // No operand predicates
|
|
|
|
// NOOPT-NEXT: // (imm:{ *:[i32] }):$imm => (MOVimm:{ *:[i32] } (imm:{ *:[i32] }):$imm)
|
|
|
|
// NOOPT-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/MyTarget::MOVimm,
|
|
|
|
// NOOPT-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // dst
|
|
|
|
// NOOPT-NEXT: GIR_CopyConstantAsSImm, /*NewInsnID*/0, /*OldInsnID*/0, // imm
|
|
|
|
// NOOPT-NEXT: GIR_EraseFromParent, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 16,
|
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
2017-11-11 11:23:44 +08:00
|
|
|
|
2018-02-17 06:37:15 +08:00
|
|
|
def MOVimm : I<(outs GPR32:$dst), (ins i32imm:$imm), [(set GPR32:$dst, imm:$imm)]>;
|
|
|
|
|
|
|
|
def fpimmz : FPImmLeaf<f32, [{ return Imm->isExactlyValue(0.0); }]>;
|
|
|
|
def MOVfpimmz : I<(outs FPR32:$dst), (ins f32imm:$imm), [(set FPR32:$dst, fpimmz:$imm)]>;
|
|
|
|
|
2017-11-11 11:23:44 +08:00
|
|
|
//===- Test a pattern with an MBB operand. --------------------------------===//
|
2018-05-22 07:28:51 +08:00
|
|
|
//
|
|
|
|
// NOOPT-NEXT: GIM_Try, /*On fail goto*//*Label [[LABEL_NUM:[0-9]+]]*/ [[LABEL:[0-9]+]],
|
|
|
|
// NOOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/1,
|
[TableGen][GlobalISel] Optimize MatchTable for faster instruction selection
*** Context ***
Prior to this patchw, the table generated for matching instruction was
straight forward but highly inefficient.
Basically, each pattern generates its own set of self contained checks
and actions.
E.g., TableGen generated:
// First pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDrr
// Second pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDri
// Third pattern
CheckNumOperand 3
CheckOpcode G_SUB
...
Build SUBrr
*** Problem ***
Because of that generation, a *lot* of check were redundant between each
pattern and were checked every single time until we reach the pattern
that matches.
E.g., Taking the previous table, let say we are matching a G_SUB, that
means we were going to check all the rules for G_ADD before looking at
the G_SUB rule. In particular we are going to do:
check 3 operands; PASS
check G_ADD; FAIL
; Next rule
check 3 operands; PASS (but we already knew that!)
check G_ADD; FAIL (well it is still not true)
; Next rule
check 3 operands; PASS (really!!)
check G_SUB; PASS (at last :P)
*** Proposed Solution ***
This patch introduces a concept of group of rules (GroupMatcher) that
share some predicates and only get checked once for the whole group.
This patch only creates groups with one nesting level. Conceptually
there is nothing preventing us for having deeper nest level. However,
the current implementation is not smart enough to share the recording
(aka capturing) of values. That limits its ability to do more sharing.
For the given example the current patch will generate:
// First group
CheckOpcode G_ADD
// First pattern
CheckNumOperand 3
...
Build ADDrr
// Second pattern
CheckNumOperand 3
...
Build ADDri
// Second group
CheckOpcode G_SUB
// Third pattern
CheckNumOperand 3
...
Build SUBrr
But if we allowed several nesting level, it could create a sub group
for the checknumoperand 3.
(We would need to call optimizeRules on the rules within a group.)
*** Result ***
With only one level of nesting, the instruction selection pass is up
to 4x faster. For instance, one instruction now takes 500 checks,
instead of 24k! With more nesting we could get in the tens I believe.
Differential Revision: https://reviews.llvm.org/D39034
rdar://problem/34670699
llvm-svn: 321017
2017-12-19 03:47:41 +08:00
|
|
|
// NOOPT-NEXT: GIM_CheckOpcode, /*MI*/0, TargetOpcode::G_BR,
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: // MIs[0] target
|
|
|
|
// NOOPT-NEXT: GIM_CheckIsMBB, /*MI*/0, /*Op*/0,
|
|
|
|
// NOOPT-NEXT: // (br (bb:{ *:[Other] }):$target) => (BR (bb:{ *:[Other] }):$target)
|
|
|
|
// NOOPT-NEXT: GIR_MutateOpcode, /*InsnID*/0, /*RecycleInsnID*/0, /*Opcode*/MyTarget::BR,
|
|
|
|
// NOOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
|
|
|
|
// NOOPT-NEXT: // GIR_Coverage, 18,
|
|
|
|
// NOOPT-NEXT: GIR_Done,
|
|
|
|
// NOOPT-NEXT: // Label [[LABEL_NUM]]: @[[LABEL]]
|
2017-08-08 18:44:31 +08:00
|
|
|
|
|
|
|
def BR : I<(outs), (ins unknown:$target),
|
|
|
|
[(br bb:$target)]>;
|
|
|
|
|
2018-05-22 07:28:51 +08:00
|
|
|
// NOOPT-NEXT: GIM_Reject,
|
|
|
|
// NOOPT-NEXT: };
|
|
|
|
// NOOPT-NEXT: return MatchTable0;
|