llvm-project/llvm/lib/Target/PowerPC/PPCSchedule.td

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

144 lines
5.4 KiB
TableGen
Raw Normal View History

//===-- PPCSchedule.td - PowerPC Scheduling Definitions ----*- tablegen -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//===----------------------------------------------------------------------===//
// Instruction Itinerary classes used for PowerPC
//
def IIC_IntSimple : InstrItinClass;
def IIC_IntGeneral : InstrItinClass;
def IIC_IntCompare : InstrItinClass;
def IIC_IntISEL : InstrItinClass;
def IIC_IntDivD : InstrItinClass;
def IIC_IntDivW : InstrItinClass;
def IIC_IntMFFS : InstrItinClass;
def IIC_IntMFVSCR : InstrItinClass;
def IIC_IntMTFSB0 : InstrItinClass;
def IIC_IntMTSRD : InstrItinClass;
def IIC_IntMulHD : InstrItinClass;
def IIC_IntMulHW : InstrItinClass;
def IIC_IntMulHWU : InstrItinClass;
def IIC_IntMulLI : InstrItinClass;
def IIC_IntRFID : InstrItinClass;
def IIC_IntRotateD : InstrItinClass;
def IIC_IntRotateDI : InstrItinClass;
def IIC_IntRotate : InstrItinClass;
def IIC_IntShift : InstrItinClass;
def IIC_IntTrapD : InstrItinClass;
def IIC_IntTrapW : InstrItinClass;
def IIC_BrB : InstrItinClass;
def IIC_BrCR : InstrItinClass;
def IIC_BrMCR : InstrItinClass;
def IIC_BrMCRX : InstrItinClass;
def IIC_LdStDCBA : InstrItinClass;
def IIC_LdStDCBF : InstrItinClass;
def IIC_LdStDCBI : InstrItinClass;
def IIC_LdStLoad : InstrItinClass;
def IIC_LdStLoadUpd : InstrItinClass;
def IIC_LdStLoadUpdX : InstrItinClass;
def IIC_LdStStore : InstrItinClass;
def IIC_LdStDSS : InstrItinClass;
def IIC_LdStICBI : InstrItinClass;
def IIC_LdStLD : InstrItinClass;
def IIC_LdStLDU : InstrItinClass;
def IIC_LdStLDUX : InstrItinClass;
def IIC_LdStLDARX : InstrItinClass;
def IIC_LdStLFD : InstrItinClass;
def IIC_LdStLFDU : InstrItinClass;
def IIC_LdStLFDUX : InstrItinClass;
def IIC_LdStLHA : InstrItinClass;
def IIC_LdStLHAU : InstrItinClass;
def IIC_LdStLHAUX : InstrItinClass;
def IIC_LdStLMW : InstrItinClass;
def IIC_LdStLQ : InstrItinClass;
def IIC_LdStLQARX : InstrItinClass;
def IIC_LdStLVecX : InstrItinClass;
def IIC_LdStLWA : InstrItinClass;
def IIC_LdStLWARX : InstrItinClass;
def IIC_LdStSLBIA : InstrItinClass;
def IIC_LdStSLBIE : InstrItinClass;
def IIC_LdStSTD : InstrItinClass;
def IIC_LdStSTDCX : InstrItinClass;
def IIC_LdStSTQ : InstrItinClass;
def IIC_LdStSTQCX : InstrItinClass;
def IIC_LdStSTU : InstrItinClass;
def IIC_LdStSTUX : InstrItinClass;
def IIC_LdStSTFD : InstrItinClass;
def IIC_LdStSTFDU : InstrItinClass;
def IIC_LdStSTVEBX : InstrItinClass;
def IIC_LdStSTWCX : InstrItinClass;
def IIC_LdStSync : InstrItinClass;
def IIC_LdStCOPY : InstrItinClass;
def IIC_LdStPASTE : InstrItinClass;
def IIC_SprISYNC : InstrItinClass;
def IIC_SprMFSR : InstrItinClass;
def IIC_SprMTMSR : InstrItinClass;
def IIC_SprMTSR : InstrItinClass;
def IIC_SprTLBSYNC : InstrItinClass;
def IIC_SprMFCR : InstrItinClass;
def IIC_SprMFCRF : InstrItinClass;
def IIC_SprMFMSR : InstrItinClass;
def IIC_SprMFSPR : InstrItinClass;
def IIC_SprMFTB : InstrItinClass;
def IIC_SprMTSPR : InstrItinClass;
def IIC_SprMTSRIN : InstrItinClass;
def IIC_SprRFI : InstrItinClass;
def IIC_SprSC : InstrItinClass;
def IIC_FPGeneral : InstrItinClass;
def IIC_FPDGeneral : InstrItinClass;
def IIC_FPSGeneral : InstrItinClass;
def IIC_FPAddSub : InstrItinClass;
def IIC_FPCompare : InstrItinClass;
def IIC_FPDivD : InstrItinClass;
def IIC_FPDivS : InstrItinClass;
def IIC_FPFused : InstrItinClass;
def IIC_FPRes : InstrItinClass;
def IIC_FPSqrtD : InstrItinClass;
def IIC_FPSqrtS : InstrItinClass;
def IIC_VecGeneral : InstrItinClass;
def IIC_VecFP : InstrItinClass;
def IIC_VecFPCompare : InstrItinClass;
def IIC_VecComplex : InstrItinClass;
def IIC_VecPerm : InstrItinClass;
def IIC_VecFPRound : InstrItinClass;
def IIC_VecVSL : InstrItinClass;
def IIC_VecVSR : InstrItinClass;
def IIC_SprMTMSRD : InstrItinClass;
def IIC_SprSLIE : InstrItinClass;
def IIC_SprSLBFEE : InstrItinClass;
def IIC_SprSLBIE : InstrItinClass;
def IIC_SprSLBIEG : InstrItinClass;
def IIC_SprSLBMTE : InstrItinClass;
def IIC_SprSLBMFEE : InstrItinClass;
def IIC_SprSLBMFEV : InstrItinClass;
def IIC_SprSLBIA : InstrItinClass;
def IIC_SprSLBSYNC : InstrItinClass;
2014-08-03 04:16:29 +08:00
def IIC_SprTLBIA : InstrItinClass;
def IIC_SprTLBIEL : InstrItinClass;
def IIC_SprTLBIE : InstrItinClass;
def IIC_SprABORT : InstrItinClass;
def IIC_SprMSGSYNC : InstrItinClass;
def IIC_SprSTOP : InstrItinClass;
def IIC_SprMFPMR : InstrItinClass;
def IIC_SprMTPMR : InstrItinClass;
//===----------------------------------------------------------------------===//
// Processor instruction itineraries.
include "PPCScheduleG3.td"
include "PPCSchedule440.td"
include "PPCScheduleG4.td"
include "PPCScheduleG4Plus.td"
include "PPCScheduleG5.td"
Add a scheduling model (with itinerary) for the PPC POWER7 This adds a scheduling model for the POWER7 (P7) core, and enables the machine-instruction scheduler when targeting the P7. Scheduling for the P7, like earlier ooo PPC cores, requires considering both dispatch group hazards, and functional unit resources and latencies. These are both modeled in a combined itinerary. Dispatch group formation is still handled by the post-RA scheduler (which still needs to be updated for the P7, but nevertheless does a pretty good job). One interesting aspect of this change is that I've also enabled to use of AA duing CodeGen for the P7 (just as it is for the embedded cores). The benchmark results seem to support this decision (see below), and while this is normally useful for in-order cores, and not for ooo cores like the P7, I think that the dispatch slot hazards are enough like in-order resources to make the AA useful. Test suite significant performance differences (where negative is a speedup, and positive is a regression) vs. the current situation: MultiSource/Benchmarks/BitBench/drop3/drop3 with AA: N/A without AA: -28.7614% +/- 19.8356% (significantly against AA) MultiSource/Benchmarks/FreeBench/neural/neural with AA: -17.7406% +/- 11.2712% without AA: N/A (significantly in favor of AA) MultiSource/Benchmarks/SciMark2-C/scimark2 with AA: -11.2079% +/- 1.80543% without AA: -11.3263% +/- 2.79651% MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt with AA: -41.8649% +/- 17.0053% without AA: -34.5256% +/- 23.7072% MultiSource/Benchmarks/mafft/pairlocalalign with AA: 25.3016% +/- 17.8614% without AA: 38.6629% +/- 14.9391% (significantly in favor of AA) MultiSource/Benchmarks/sim/sim with AA: N/A without AA: 13.4844% +/- 7.18195% (significantly in favor of AA) SingleSource/Benchmarks/BenchmarkGame/Large/fasta with AA: 15.0664% +/- 6.70216% without AA: 12.7747% +/- 8.43043% SingleSource/Benchmarks/BenchmarkGame/puzzle with AA: 82.2713% +/- 26.3567% without AA: 75.7525% +/- 41.1842% SingleSource/Benchmarks/Misc/flops-2 with AA: -37.1621% +/- 20.7964% without AA: -35.2342% +/- 20.2999% (significantly in favor of AA) These are 99.5% confidence intervals from 5 runs per configuration. Regarding the choice to turn on AA during CodeGen, of these results, four seem significantly in favor of using AA, and one seems significantly against. I'm not making this decision based on these numbers alone, but these results seem consistent with results I have from other tests, and so I think that, on balance, using AA is a win. llvm-svn: 195981
2013-12-01 04:55:12 +08:00
include "PPCScheduleP7.td"
include "PPCScheduleP8.td"
include "PPCScheduleP9.td"
include "PPCScheduleA2.td"
include "PPCScheduleE500.td"
include "PPCScheduleE500mc.td"
include "PPCScheduleE5500.td"