llvm-project/llvm/lib/Target/AMDGPU
Stanislav Mekhanoshin 555d8f4ef5 [AMDGPU] Bundle loads before post-RA scheduler
We are relying on atrificial DAG edges inserted by the
MemOpClusterMutation to keep loads and stores together in the
post-RA scheduler. This does not work all the time since it
allows to schedule a completely independent instruction in the
middle of the cluster.

Removed the DAG mutation and added pass to bundle already
clustered instructions. These bundles are unpacked before the
memory legalizer because it does not work with bundles but also
because it allows to insert waitcounts in the middle of a store
cluster.

Removing artificial edges also allows a more relaxed scheduling.

Differential Revision: https://reviews.llvm.org/D72737
2020-01-24 11:33:38 -08:00
..
AsmParser CMake: Make most target symbols hidden by default 2020-01-14 19:46:52 -08:00
Disassembler CMake: Make most target symbols hidden by default 2020-01-14 19:46:52 -08:00
MCTargetDesc CMake: Make most target symbols hidden by default 2020-01-14 19:46:52 -08:00
TargetInfo CMake: Make most target symbols hidden by default 2020-01-14 19:46:52 -08:00
Utils AMDGPU/R600: Emit rodata in text segment 2020-01-22 14:31:51 -05:00
AMDGPU.h [AMDGPU] Bundle loads before post-RA scheduler 2020-01-24 11:33:38 -08:00
AMDGPU.td [AMDGPU] w/a for gfx908 mfma SrcC literal HW bug 2019-08-23 22:09:58 +00:00
AMDGPUAliasAnalysis.cpp AMDGPU: Improve alias analysis for GDS 2019-07-17 11:22:19 +00:00
AMDGPUAliasAnalysis.h
AMDGPUAlwaysInlinePass.cpp AMDGPU: Simplify getAddressSpace calls 2019-10-31 07:51:38 -07:00
AMDGPUAnnotateKernelFeatures.cpp Use llvm::StringLiteral instead of StringRef in few places 2019-09-20 14:31:42 +00:00
AMDGPUAnnotateUniformValues.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
AMDGPUArgumentUsageInfo.cpp [AMDGPU] Packed thread ids in function call ABI 2019-06-28 01:52:13 +00:00
AMDGPUArgumentUsageInfo.h AMDGPU: Fix Register copypaste error 2019-09-05 23:07:10 +00:00
AMDGPUAsmPrinter.cpp CMake: Make most target symbols hidden by default 2020-01-14 19:46:52 -08:00
AMDGPUAsmPrinter.h [AMDGPU] separate accounting for agprs 2019-10-02 00:26:58 +00:00
AMDGPUAtomicOptimizer.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
AMDGPUCallLowering.cpp AMDGPU/GlobalISel: Fix argument lowering for vectors of pointers 2020-01-09 16:29:44 -05:00
AMDGPUCallLowering.h AMDGPU/GlobalISel: Rename MIRBuilder to B. NFC 2019-09-09 23:06:13 +00:00
AMDGPUCallingConv.td [AMDGPU] Adjust number of SGPRs available in Calling Convention 2019-08-28 15:00:45 +00:00
AMDGPUCodeGenPrepare.cpp AMDGPU: Implement FDIV optimizations in AMDGPUCodeGenPrepare 2020-01-23 16:57:43 -08:00
AMDGPUCombine.td AMDGPU/GlobalISel: Add pre-legalize combiner pass 2020-01-22 10:16:39 -05:00
AMDGPUFeatures.td
AMDGPUFixFunctionBitcasts.cpp
AMDGPUFrameLowering.cpp Use Align for TFL::TransientStackAlignment 2019-10-21 08:31:25 +00:00
AMDGPUFrameLowering.h [Alignment][NFC] Deprecate Align::None() 2020-01-24 12:53:58 +01:00
AMDGPUGISel.td AMDGPU/GlobalISel: Remove redundant or patterns 2020-01-22 21:45:51 -05:00
AMDGPUGenRegisterBankInfo.def AMDGPU/GlobalISel: Fix RegBankSelect for G_INSERT_VECTOR_ELT 2020-01-22 10:57:50 -05:00
AMDGPUGlobalISelUtils.cpp AMDGPU/GlobalISel: Add new utils file 2020-01-03 15:25:50 -05:00
AMDGPUGlobalISelUtils.h AMDGPU/GlobalISel: Add new utils file 2020-01-03 15:25:50 -05:00
AMDGPUHSAMetadataStreamer.cpp [AMDGPU] add support for hostcall buffer pointer as hidden kernel argument 2019-11-20 15:53:55 +05:30
AMDGPUHSAMetadataStreamer.h [llvm] Migrate llvm::make_unique to std::make_unique 2019-08-15 15:54:37 +00:00
AMDGPUISelDAGToDAG.cpp AMDGPU: Remove VOP3Mods0Clamp0OMod 2020-01-07 15:10:08 -05:00
AMDGPUISelLowering.cpp [AMDGPU] Allow narrowing muti-dword loads 2020-01-24 11:03:41 -08:00
AMDGPUISelLowering.h AMDGPU: Remove custom node for exports 2020-01-15 18:33:15 -05:00
AMDGPUInline.cpp [NFC] Refactor InlineResult for readability 2020-01-15 13:34:20 -08:00
AMDGPUInstrInfo.cpp
AMDGPUInstrInfo.h
AMDGPUInstrInfo.td AMDGPU: Remove custom node for exports 2020-01-15 18:33:15 -05:00
AMDGPUInstructionSelector.cpp AMDGPU/GlobalISel: Handle 16-bank LDS llvm.amdgcn.interp.p1.f16 2020-01-22 12:10:59 -05:00
AMDGPUInstructionSelector.h AMDGPU/GlobalISel: Handle 16-bank LDS llvm.amdgcn.interp.p1.f16 2020-01-22 12:10:59 -05:00
AMDGPUInstructions.td AMDGPU/GlobalISel: Fix import of integer med3 2020-01-09 10:29:32 -05:00
AMDGPULegalizerInfo.cpp AMDGPU/GlobalISel: Handle atomic_inc/atomic_dec 2020-01-22 09:26:17 -05:00
AMDGPULegalizerInfo.h AMDGPU/GlobalISel: Handle atomic_inc/atomic_dec 2020-01-22 09:26:17 -05:00
AMDGPULibCalls.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
AMDGPULibFunc.cpp [AMDGPU] Downgrade from StringLiteral to const char* in an attempt to make GCC 5 happy 2019-08-25 12:47:31 +00:00
AMDGPULibFunc.h
AMDGPULowerIntrinsics.cpp
AMDGPULowerKernelArguments.cpp [Alignement][NFC] Deprecate untyped CreateAlignedLoad 2020-01-23 13:34:32 +01:00
AMDGPULowerKernelAttributes.cpp
AMDGPUMCInstLower.cpp [MC] Add parameter `Address` to MCInstPrinter::printInst 2020-01-06 20:42:22 -08:00
AMDGPUMachineCFGStructurizer.cpp [AMDGPU] Fixes -Wrange-loop-analysis warnings 2019-12-22 19:39:28 +01:00
AMDGPUMachineFunction.cpp AMDGPU: Refactor treatment of denormal mode 2019-11-19 19:55:43 +05:30
AMDGPUMachineFunction.h AMDGPU: Refactor treatment of denormal mode 2019-11-19 19:55:43 +05:30
AMDGPUMachineModuleInfo.cpp
AMDGPUMachineModuleInfo.h
AMDGPUMacroFusion.cpp
AMDGPUMacroFusion.h
AMDGPUOpenCLEnqueuedBlockLowering.cpp Fix parameter name comments using clang-tidy. NFC. 2019-07-16 04:46:31 +00:00
AMDGPUPTNote.h
AMDGPUPerfHintAnalysis.cpp AMDGPU: Fix assert in clang test 2019-07-05 21:09:53 +00:00
AMDGPUPerfHintAnalysis.h AMDGPU: Make AMDGPUPerfHintAnalysis an SCC pass 2019-07-05 20:26:13 +00:00
AMDGPUPreLegalizerCombiner.cpp AMDGPU/GlobalISel: Add pre-legalize combiner pass 2020-01-22 10:16:39 -05:00
AMDGPUPrintfRuntimeBinding.cpp [AMDGPU] add support for hostcall buffer pointer as hidden kernel argument 2019-11-20 15:53:55 +05:30
AMDGPUPromoteAlloca.cpp [Alignement][NFC] Deprecate untyped CreateAlignedLoad 2020-01-23 13:34:32 +01:00
AMDGPUPropagateAttributes.cpp AMDGPU: Move DEBUG_TYPE definition below includes 2019-07-08 18:48:39 +00:00
AMDGPURegisterBankInfo.cpp AMDGPU/GlobalISel: Fix RegBanKSelect for llvm.amdgcn.exp.compr 2020-01-23 13:30:46 -08:00
AMDGPURegisterBankInfo.h AMDGPU/GlobalISel: Fix G_EXTRACT_VECTOR_ELT mapping for s-v case 2020-01-09 19:46:54 -05:00
AMDGPURegisterBanks.td AMDGPU/GlobalISel: Replace handling of boolean values 2020-01-06 18:26:42 -05:00
AMDGPURegisterInfo.cpp AMDGPU: Remove outdated comment 2020-01-16 14:54:27 -05:00
AMDGPURegisterInfo.h AMDGPU/GlobalISel: Handle more G_INSERT cases 2019-10-07 19:16:26 +00:00
AMDGPURegisterInfo.td [AMDGPU] gfx908 register file changes 2019-07-09 19:41:51 +00:00
AMDGPURewriteOutArguments.cpp [Alignment][NFC] Use Align with CreateAlignedStore 2020-01-23 17:34:32 +01:00
AMDGPUSearchableTables.td [AMDGPU][SILoadStoreOptimizer] Merge TBUFFER loads/stores 2019-11-20 22:59:30 +01:00
AMDGPUSubtarget.cpp [AMDGPU] Bundle loads before post-RA scheduler 2020-01-24 11:33:38 -08:00
AMDGPUSubtarget.h [AMDGPU] allow multi-dword flat scratch access since GFX9 2020-01-17 10:47:03 -08:00
AMDGPUTargetMachine.cpp [AMDGPU] Bundle loads before post-RA scheduler 2020-01-24 11:33:38 -08:00
AMDGPUTargetMachine.h
AMDGPUTargetObjectFile.cpp
AMDGPUTargetObjectFile.h
AMDGPUTargetTransformInfo.cpp Resubmit: [DA][TTI][AMDGPU] Add option to select GPUDA with TTI 2020-01-24 10:39:40 -08:00
AMDGPUTargetTransformInfo.h Resubmit: [DA][TTI][AMDGPU] Add option to select GPUDA with TTI 2020-01-24 10:39:40 -08:00
AMDGPUUnifyDivergentExitNodes.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
AMDGPUUnifyMetadata.cpp [AMDGPU] Fixes -Wrange-loop-analysis warnings 2019-12-22 19:39:28 +01:00
AMDILCFGStructurizer.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
AMDKernelCodeT.h [AMDGPU] gfx1010 wave32 metadata 2019-06-17 16:48:56 +00:00
BUFInstructions.td AMDGPU: Add register classes to MUBUF load patterns 2020-01-16 22:00:44 -05:00
CMakeLists.txt [AMDGPU] Bundle loads before post-RA scheduler 2020-01-24 11:33:38 -08:00
CaymanInstructions.td
DSInstructions.td TableGen/GlobalISel: Add way for SDNodeXForm to work on timm 2020-01-09 17:37:52 -05:00
EvergreenInstructions.td AMDGPU: Start redefining atomic PatFrags 2019-08-01 03:25:52 +00:00
FLATInstructions.td AMDGPU: Eliminate more legacy codepred address space PatFrags 2020-01-09 10:29:32 -05:00
GCNDPPCombine.cpp [AMDGPU][DPP] Corrected DPP combiner 2019-11-20 15:56:45 +03:00
GCNHazardRecognizer.cpp Make more use of MachineInstr::mayLoadOrStore. 2019-12-19 11:51:52 +00:00
GCNHazardRecognizer.h [AMDGPU] gfx908 hazard recognizer 2019-07-11 21:30:34 +00:00
GCNILPSched.cpp Prune a LegacyDivergenceAnalysis and MachineLoopInfo include each 2019-10-19 01:31:09 +00:00
GCNIterativeScheduler.cpp [llvm] Migrate llvm::make_unique to std::make_unique 2019-08-15 15:54:37 +00:00
GCNIterativeScheduler.h
GCNMinRegStrategy.cpp
GCNNSAReassign.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
GCNProcessors.td [AMDGPU] gfx908 target 2019-07-09 18:10:06 +00:00
GCNRegBankReassign.cpp AMDGPU: Fix ubsan error 2020-01-23 15:05:47 -05:00
GCNRegPressure.cpp Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM 2019-08-15 19:22:08 +00:00
GCNRegPressure.h Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC 2019-08-01 23:27:28 +00:00
GCNSchedStrategy.cpp [AMDGPU] Revert scheduling to reduce spilling 2020-01-03 15:20:21 -08:00
GCNSchedStrategy.h AMDGPU: Avoid constructing new std::vector in initCandidate 2019-09-05 22:44:06 +00:00
LLVMBuild.txt [AMDGPU] Move InstPrinter files to MCTargetDesc. NFC 2019-05-11 00:03:35 +00:00
MIMGInstructions.td [AMDGPU] deduplicate tablegen predicates 2019-11-04 12:19:17 -08:00
R600.td
R600AsmPrinter.cpp [NFC] Fix trivial typos in comments 2020-01-06 10:50:26 +00:00
R600AsmPrinter.h
R600ClauseMergePass.cpp
R600ControlFlowFinalizer.cpp Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM 2019-08-15 19:22:08 +00:00
R600Defines.h
R600EmitClauseMarkers.cpp
R600ExpandSpecialInstrs.cpp Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM 2019-08-15 19:22:08 +00:00
R600FrameLowering.cpp
R600FrameLowering.h [Alignment][NFC] Deprecate Align::None() 2020-01-24 12:53:58 +01:00
R600ISelLowering.cpp [DAG] Add helper for creating constant vector index with correct type. NFC. 2020-01-18 01:23:36 -05:00
R600ISelLowering.h [TargetLowering] Add MachineMemOperand::Flags to allowsMemoryAccess tests (PR42123) 2019-06-12 17:14:03 +00:00
R600InstrFormats.td
R600InstrInfo.cpp Update spelling of {analyze,insert,remove}Branch in strings and comments 2020-01-21 10:15:38 -06:00
R600InstrInfo.h Use MCRegister in copyPhysReg 2019-11-11 14:42:33 +05:30
R600Instructions.td AMDGPU: Eliminate more legacy codepred address space PatFrags 2020-01-09 10:29:32 -05:00
R600MachineFunctionInfo.cpp
R600MachineFunctionInfo.h
R600MachineScheduler.cpp Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM 2019-08-15 19:22:08 +00:00
R600MachineScheduler.h
R600OpenCLImageTypeLoweringPass.cpp
R600OptimizeVectorRegisters.cpp Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM 2019-08-15 19:22:08 +00:00
R600Packetizer.cpp Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM 2019-08-15 19:22:08 +00:00
R600Processors.td
R600RegisterInfo.cpp Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC 2019-08-01 23:27:28 +00:00
R600RegisterInfo.h [TargetRegisterInfo] Default trackLivenessAfterRegAlloc() to true 2020-01-19 14:20:37 -08:00
R600RegisterInfo.td
R600Schedule.td
R700Instructions.td
SIAddIMGInit.cpp Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM 2019-08-15 19:22:08 +00:00
SIAnnotateControlFlow.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
SIDefines.h [AMDGPU] Added MI bit IsDOT 2019-09-17 17:56:13 +00:00
SIFixSGPRCopies.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
SIFixVGPRCopies.cpp
SIFixupVectorISel.cpp Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC 2019-08-01 23:27:28 +00:00
SIFoldOperands.cpp [amdgpu] Remove unused header. NFC. 2020-01-08 11:32:09 -05:00
SIFormMemoryClauses.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
SIFrameLowering.cpp [AMDGPU] Don't create MachinePointerInfos with an UndefValue pointer 2019-12-23 15:58:19 +00:00
SIFrameLowering.h [Alignment][NFC] Deprecate Align::None() 2020-01-24 12:53:58 +01:00
SIISelLowering.cpp AMDGPU: Implement FDIV optimizations in AMDGPUCodeGenPrepare 2020-01-23 16:57:43 -08:00
SIISelLowering.h CodeGen: Use LLT instead of EVT in getRegisterByName 2020-01-09 17:37:52 -05:00
SIInsertSkips.cpp Resubmit: [AMDGPU] Invert the handling of skip insertion. 2020-01-22 13:18:32 +09:00
SIInsertWaitcnts.cpp [AMDGPU] need to insert wait between the scalar load and vector store to the same address to avoid WAR conflict. 2020-01-04 18:23:14 +03:00
SIInstrFormats.td [AMDGPU] Added MI bit IsDOT 2019-09-17 17:56:13 +00:00
SIInstrInfo.cpp [AMDGPU] Bundle loads before post-RA scheduler 2020-01-24 11:33:38 -08:00
SIInstrInfo.h AMDGPU/GlobalISel: Select G_INSERT_VECTOR_ELT 2020-01-22 11:00:49 -05:00
SIInstrInfo.td AMDGPU/GlobalISel: Select llvm.amdgcn.update.dpp 2020-01-17 20:09:53 -05:00
SIInstructions.td AMDGPU/GlobalISel: Select llvm.amdgcn.mov.dpp 2020-01-22 11:43:53 -05:00
SILoadStoreOptimizer.cpp AMDGPU/SILoadStoreOptimillzer: Refactor CombineInfo struct 2019-12-17 13:43:10 -08:00
SILowerControlFlow.cpp Resubmit: [AMDGPU] Invert the handling of skip insertion. 2020-01-22 13:18:32 +09:00
SILowerI1Copies.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
SILowerSGPRSpills.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
SIMachineFunctionInfo.cpp AMDGPU: Refactor treatment of denormal mode 2019-11-19 19:55:43 +05:30
SIMachineFunctionInfo.h AMDGPU: Refactor treatment of denormal mode 2019-11-19 19:55:43 +05:30
SIMachineScheduler.cpp AMDGPU/SI: make ~SIScheduleBlockCreator trivial 2019-11-11 21:51:59 -08:00
SIMachineScheduler.h AMDGPU/SI: make ~SIScheduleBlockCreator trivial 2019-11-11 21:51:59 -08:00
SIMemoryLegalizer.cpp [AMDGPU] Bundle loads before post-RA scheduler 2020-01-24 11:33:38 -08:00
SIModeRegister.cpp [llvm] Migrate llvm::make_unique to std::make_unique 2019-08-15 15:54:37 +00:00
SIOptimizeExecMasking.cpp AMDGPU: Use Register 2019-12-27 16:53:21 -05:00
SIOptimizeExecMaskingPreRA.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
SIPeepholeSDWA.cpp AMDGPU: Fixed indeterminate map iteration in SIPeepholeSDWA 2019-12-02 12:08:49 +00:00
SIPostRABundler.cpp [AMDGPU] Bundle loads before post-RA scheduler 2020-01-24 11:33:38 -08:00
SIPreAllocateWWMRegs.cpp Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
SIProgramInfo.h [AMDGPU] separate accounting for agprs 2019-10-02 00:26:58 +00:00
SIRegisterInfo.cpp [TargetRegisterInfo] Default trackLivenessAfterRegAlloc() to true 2020-01-19 14:20:37 -08:00
SIRegisterInfo.h [TargetRegisterInfo] Default trackLivenessAfterRegAlloc() to true 2020-01-19 14:20:37 -08:00
SIRegisterInfo.td AMDGPU: Make VReg_1 only include 1 artificial register 2019-10-28 20:51:51 -07:00
SIRemoveShortExecBranches.cpp [AMDGPU] SIRemoveShortExecBranches should not remove branches exiting loops 2020-01-22 13:18:40 +09:00
SISchedule.td [AMDGPU] gfx908 scheduling 2019-07-11 21:25:00 +00:00
SIShrinkInstructions.cpp AMDGPU: Don't fold S_NOPs with implicit operands 2019-10-30 14:40:56 -07:00
SIWholeQuadMode.cpp [AMDGPU] Remove unnecessary v_mov from a register to itself in WQM lowering. 2020-01-10 23:01:19 -05:00
SMInstructions.td [AMDGPU] deduplicate tablegen predicates 2019-11-04 12:19:17 -08:00
SOPInstructions.td AMDGPU: Prepare to use scalar register indexing 2020-01-20 17:19:16 -05:00
VIInstrFormats.td
VIInstructions.td
VOP1Instructions.td AMDGPU/GlobalISel: Select llvm.amdgcn.mov.dpp 2020-01-22 11:43:53 -05:00
VOP2Instructions.td AMDGPU/GlobalISel: Fix import of zext of s16 op patterns 2020-01-09 10:29:32 -05:00
VOP3Instructions.td AMDGPU/GlobalISel: Select V_ADD3_U32/V_XOR3_B32 2020-01-23 12:04:20 -05:00
VOP3PInstructions.td [AMDGPU] deduplicate tablegen predicates 2019-11-04 12:19:17 -08:00
VOPCInstructions.td AMDGPU: Remove VOP3Mods0Clamp0OMod 2020-01-07 15:10:08 -05:00
VOPInstructions.td [AMDGPU] copy OtherPredicates from pseudo to VOP3_Real 2019-09-26 21:06:17 +00:00