llvm-project/llvm/lib/Target/AMDGPU
Nicolai Haehnle b4f28deda0 AMDGPU: Re-organize the outer loop of SILoadStoreOptimizer
Summary:
The entire algorithm operates per basic-block, so for cache locality
it should be better to re-optimize a basic-block immediately rather than
in a separate loop.
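
For illustration only, the loop shape described above could look roughly like the following. This is a minimal sketch, not the actual pass code: `Block`, `optimizeBlock`, and `runOnBlocks` are hypothetical stand-ins, and the "work" is modeled as a simple counter.

```cpp
#include <vector>

// Hypothetical stand-in for a basic block: `pending` models how many
// further optimization rounds would still find something to do.
struct Block {
  int pending;
};

// Hypothetical per-block step: performs one round of work and returns
// true if the block was changed (so another round may be worthwhile).
static bool optimizeBlock(Block &B) {
  if (B.pending == 0)
    return false;
  --B.pending;
  return true;
}

// Sketch of the re-organized outer loop: each block is re-optimized
// immediately, until nothing changes, before the next block is visited.
// There is no separate second loop over all blocks.
static bool runOnBlocks(std::vector<Block> &Blocks) {
  bool Modified = false;
  for (Block &B : Blocks)
    while (optimizeBlock(B))
      Modified = true;
  return Modified;
}
```

The alternative shape would run one round over every block and then loop over all blocks again for the re-optimization rounds; the sketch keeps all rounds for a block together, which is the locality argument made above.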

I don't have performance measurements.

Change-Id: I85106570bd623c4ff277faaa50ee43258e1ddcc5

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D40344

llvm-svn: 319156
2017-11-28 08:42:46 +00:00
AsmParser [AMDGPU][MC][GFX9] Added support of 'inst_offset' modifier for compatibility with SP3 2017-11-24 13:22:38 +00:00
Disassembler [AMDGPU][MC][GFX8][GFX9] Corrected names of integer v_{add/addc/sub/subrev/subb/subbrev} 2017-11-20 18:24:21 +00:00
InstPrinter [AMDGPU][MC][DISASSEMBLER][GFX9] Corrected decoding of GLOBAL/SCRATCH opcodes 2017-11-27 17:14:35 +00:00
MCTargetDesc [AMDGPU] Emit metadata for hidden arguments for kernel enqueue 2017-10-30 14:30:28 +00:00
TargetInfo Add backend name to Target to enable runtime info to be fed back into TableGen 2017-11-15 23:55:44 +00:00
Utils AMDGPU: Fix set but not used warnings related to AMDGPUAS 2017-11-01 19:12:38 +00:00
AMDGPU.h [AMDGPU] Clean up symbols in the global namespace. 2017-10-31 23:21:30 +00:00
AMDGPU.td AMDGPU: Don't use MUBUF vaddr if address may overflow 2017-11-15 00:45:43 +00:00
AMDGPUAliasAnalysis.cpp [AMDGPU] calling conventions for AMDPAL OS type 2017-09-29 09:51:22 +00:00
AMDGPUAliasAnalysis.h [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-08 23:53:55 +00:00
AMDGPUAlwaysInlinePass.cpp AMDGPU: Add option to stress calls 2017-09-21 07:00:48 +00:00
AMDGPUAnnotateKernelFeatures.cpp [Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes. Also affected in files (NFC). 2017-08-31 21:56:16 +00:00
AMDGPUAnnotateUniformValues.cpp AMDGPU: Fix converting unanalyzable global loads to SMRD 2017-07-12 23:06:18 +00:00
AMDGPUArgumentUsageInfo.cpp AMDGPU: Fix implicitarg.ptr handling special inputs 2017-08-03 23:12:44 +00:00
AMDGPUArgumentUsageInfo.h [AMDGPU] Fixed MSVC build break 2017-08-04 10:53:07 +00:00
AMDGPUAsmPrinter.cpp Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
AMDGPUAsmPrinter.h AMDGPU: Error on stack size overflow 2017-11-14 20:33:14 +00:00
AMDGPUCallLowering.cpp AMDGPU: Pass special input registers to functions 2017-08-03 23:00:29 +00:00
AMDGPUCallLowering.h AMDGPU: Start defining a calling convention 2017-05-17 21:56:25 +00:00
AMDGPUCallingConv.td AMDGPU: Initial implementation of calls 2017-08-01 19:54:18 +00:00
AMDGPUCodeGenPrepare.cpp [IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast-math-flag 2017-11-06 16:27:15 +00:00
AMDGPUFrameLowering.cpp [AMDGPU] Split R600/SI getFrameIndexReference and emit stack object offsets for SI 2017-03-10 19:39:07 +00:00
AMDGPUFrameLowering.h Move TargetFrameLowering.h to CodeGen where it's implemented 2017-11-03 22:32:11 +00:00
AMDGPUGenRegisterBankInfo.def [GlobalISel] Make GlobalISel a non-optional library. 2017-08-03 21:52:25 +00:00
AMDGPUISelDAGToDAG.cpp AMDGPU: Replace i64 add/sub lowering 2017-11-15 21:51:43 +00:00
AMDGPUISelLowering.cpp [AMDGPU] Add custom lowering for llvm.log{,10}.{f16,f32} intrinsics 2017-11-27 13:26:38 +00:00
AMDGPUISelLowering.h [AMDGPU] Add custom lowering for llvm.log{,10}.{f16,f32} intrinsics 2017-11-27 13:26:38 +00:00
AMDGPUInline.cpp [AMDGPU] Port of HSAIL inliner 2017-09-20 04:25:58 +00:00
AMDGPUInstrInfo.cpp [AMDGPU][MC][GFX8][GFX9] Corrected names of integer v_{add/addc/sub/subrev/subb/subbrev} 2017-11-20 18:24:21 +00:00
AMDGPUInstrInfo.h Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering 2017-11-08 01:01:31 +00:00
AMDGPUInstrInfo.td Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ. 2017-10-12 19:37:14 +00:00
AMDGPUInstructionSelector.cpp [globalisel][tablegen] Generate rule coverage and use it to identify untested rules 2017-11-16 00:46:35 +00:00
AMDGPUInstructionSelector.h [globalisel][tablegen] Generate rule coverage and use it to identify untested rules 2017-11-16 00:46:35 +00:00
AMDGPUInstructions.td AMDGPU: Select d16 loads into low component of register 2017-11-13 00:22:09 +00:00
AMDGPUIntrinsicInfo.cpp Rename AttributeSet to AttributeList 2017-03-21 16:57:19 +00:00
AMDGPUIntrinsicInfo.h
AMDGPUIntrinsics.td AMDGPU: Remove legacy bfe intrinsics 2017-04-03 18:08:08 +00:00
AMDGPULegalizerInfo.cpp Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
AMDGPULegalizerInfo.h Re-commit AMDGPU/GlobalISel: Add support for simple shaders 2017-01-30 21:56:46 +00:00
AMDGPULibCalls.cpp Make helpers static. NFC. 2017-11-24 14:55:41 +00:00
AMDGPULibFunc.cpp [AMDGPU] Remove hardcoded address space value from AMDGPULibFunc 2017-11-04 17:37:43 +00:00
AMDGPULibFunc.h [AMDGPU] Remove hardcoded address space value from AMDGPULibFunc 2017-11-04 17:37:43 +00:00
AMDGPULowerIntrinsics.cpp Extend memcpy expansion in Transform/Utils to handle wider operand types. 2017-07-07 02:00:06 +00:00
AMDGPUMCInstLower.cpp Fix AMDGPU build issue 2017-10-11 23:53:36 +00:00
AMDGPUMCInstLower.h
AMDGPUMachineCFGStructurizer.cpp Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
AMDGPUMachineFunction.cpp AMDGPU: Start defining a calling convention 2017-05-17 21:56:25 +00:00
AMDGPUMachineFunction.h Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
AMDGPUMachineModuleInfo.cpp AMDGPU: Implement memory model 2017-07-21 21:19:23 +00:00
AMDGPUMachineModuleInfo.h AMDGPU: Handle more than one memory operand in SIMemoryLegalizer 2017-09-07 16:14:21 +00:00
AMDGPUMacroFusion.cpp AMDGPU: Add macro fusion schedule DAG mutation 2017-07-06 20:57:05 +00:00
AMDGPUMacroFusion.h AMDGPU: Add macro fusion schedule DAG mutation 2017-07-06 20:57:05 +00:00
AMDGPUOpenCLEnqueuedBlockLowering.cpp AMDGPU: Fix set but not used warnings related to AMDGPUAS 2017-11-01 19:12:38 +00:00
AMDGPUOpenCLImageTypeLoweringPass.cpp [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-08 23:53:55 +00:00
AMDGPUPTNote.h AMDGPU/NFC: Move AMDGPU specific note types to ELF.h 2017-10-12 18:59:54 +00:00
AMDGPUPromoteAlloca.cpp AMDGPU: Fix assert on alloca of array of struct 2017-09-14 18:02:29 +00:00
AMDGPURegAsmNames.inc.cpp AMDGPU: Work around build special casing .inc files 2017-06-08 19:25:21 +00:00
AMDGPURegisterBankInfo.cpp Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
AMDGPURegisterBankInfo.h Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
AMDGPURegisterBanks.td Re-commit AMDGPU/GlobalISel: Add support for simple shaders 2017-01-30 21:56:46 +00:00
AMDGPURegisterInfo.cpp AMDGPU: Make frame register caller preserved 2017-09-14 17:14:57 +00:00
AMDGPURegisterInfo.h AMDGPU: Start defining a calling convention 2017-05-17 21:56:25 +00:00
AMDGPURegisterInfo.td AMDGPU: Move INDIRECT_BASE_ADDR definition out of common files 2017-07-29 03:44:07 +00:00
AMDGPURewriteOutArguments.cpp [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-08 23:53:55 +00:00
AMDGPUSubtarget.cpp AMDGPU: Don't use MUBUF vaddr if address may overflow 2017-11-15 00:45:43 +00:00
AMDGPUSubtarget.h AMDGPU: Move hazard avoidance out of waitcnt pass. 2017-11-17 21:35:32 +00:00
AMDGPUTargetMachine.cpp AMDGPU: Partial ILP scheduler port from SelectionDAG to SchedulingDAG (experimental) 2017-11-20 14:35:53 +00:00
AMDGPUTargetMachine.h Revert "TargetMachine: Merge TargetMachine and LLVMTargetMachine" 2017-10-12 22:57:28 +00:00
AMDGPUTargetObjectFile.cpp AMDGPU: Fix set but not used warnings related to AMDGPUAS 2017-11-01 19:12:38 +00:00
AMDGPUTargetObjectFile.h [AMDGPU] Get address space mapping by target triple environment 2017-03-27 14:04:01 +00:00
AMDGPUTargetTransformInfo.cpp [AMDGPU] calling conventions for AMDPAL OS type 2017-09-29 09:51:22 +00:00
AMDGPUTargetTransformInfo.h [AMDGPU] Port of HSAIL inliner 2017-09-20 04:25:58 +00:00
AMDGPUUnifyDivergentExitNodes.cpp [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). 2017-10-17 21:27:42 +00:00
AMDGPUUnifyMetadata.cpp [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-10 00:46:15 +00:00
AMDILCFGStructurizer.cpp [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-08 23:53:55 +00:00
AMDKernelCodeT.h [AMDGPU] Revert r310429 changes in AMDKernelCodeT.h which broke some build bots. 2017-08-09 00:06:29 +00:00
BUFInstructions.td AMDGPU: Select d16 loads into low component of register 2017-11-13 00:22:09 +00:00
CMakeLists.txt AMDGPU: Partial ILP scheduler port from SelectionDAG to SchedulingDAG (experimental) 2017-11-20 14:35:53 +00:00
CaymanInstructions.td AMDGPU: Remove global isGCN predicates 2017-10-03 00:06:41 +00:00
DSInstructions.td AMDGPU: Add separate definitions for DS insts without m0 use 2017-11-15 01:34:06 +00:00
EvergreenInstructions.td AMDGPU: Cleanup local atomic node names 2017-10-23 17:16:43 +00:00
FLATInstructions.td [AMDGPU][MC][DISASSEMBLER][GFX9] Corrected decoding of GLOBAL/SCRATCH opcodes 2017-11-27 17:14:35 +00:00
GCNHazardRecognizer.cpp AMDGPU: Move hazard avoidance out of waitcnt pass. 2017-11-17 21:35:32 +00:00
GCNHazardRecognizer.h AMDGPU: Move hazard avoidance out of waitcnt pass. 2017-11-17 21:35:32 +00:00
GCNILPSched.cpp AMDGPU: Partial ILP scheduler port from SelectionDAG to SchedulingDAG (experimental) 2017-11-20 14:35:53 +00:00
GCNIterativeScheduler.cpp AMDGPU: Partial ILP scheduler port from SelectionDAG to SchedulingDAG (experimental) 2017-11-20 14:35:53 +00:00
GCNIterativeScheduler.h AMDGPU: Partial ILP scheduler port from SelectionDAG to SchedulingDAG (experimental) 2017-11-20 14:35:53 +00:00
GCNMinRegStrategy.cpp [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-08 23:53:55 +00:00
GCNProcessors.td AMDGPU/NFC: Split Processors.td into GCNProcessors.td and R600Processors.td 2017-11-10 20:01:58 +00:00
GCNRegPressure.cpp Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
GCNRegPressure.h [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-10 00:46:15 +00:00
GCNSchedStrategy.cpp [CodeGen] Rename DEBUG_TYPE to match passnames 2017-07-11 22:08:28 +00:00
GCNSchedStrategy.h fix typos in comments and error messges; NFC 2017-07-13 06:48:39 +00:00
LLVMBuild.txt AMDGPU: Add GlobalISel to required_libraries. 2017-01-28 18:13:08 +00:00
MIMGInstructions.td AMDGPU: Remove global isGCN predicates 2017-10-03 00:06:41 +00:00
Processors.td AMDGPU/NFC: Split Processors.td into GCNProcessors.td and R600Processors.td 2017-11-10 20:01:58 +00:00
R600ClauseMergePass.cpp AMDGPU/R600: Initialize more passes 2017-08-02 22:19:45 +00:00
R600ControlFlowFinalizer.cpp [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-10 00:46:15 +00:00
R600Defines.h
R600EmitClauseMarkers.cpp Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
R600ExpandSpecialInstrs.cpp [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-10 00:46:15 +00:00
R600FrameLowering.cpp Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
R600FrameLowering.h AMDGPU: Initial implementation of calls 2017-08-01 19:54:18 +00:00
R600ISelLowering.cpp [AMDGPU] Fix pointer info for lowering load/store for r600 for amdgiz environment 2017-11-10 02:03:28 +00:00
R600ISelLowering.h Add DAG argument to canMergeStoresTo NFC. 2017-07-10 20:25:54 +00:00
R600InstrFormats.td AMDGPU: Remove global isGCN predicates 2017-10-03 00:06:41 +00:00
R600InstrInfo.cpp Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
R600InstrInfo.h [AMDGPU] Fix pointer info for pseudo source for r600 2017-11-10 01:53:24 +00:00
R600Instructions.td AMDGPU: Remove global isGCN predicates 2017-10-03 00:06:41 +00:00
R600Intrinsics.td AMDGPU: Make intrinsics speculatable 2017-05-02 16:57:44 +00:00
R600MachineFunctionInfo.cpp
R600MachineFunctionInfo.h
R600MachineScheduler.cpp [CodeGen] Rename DEBUG_TYPE to match passnames 2017-07-11 22:08:28 +00:00
R600MachineScheduler.h
R600OptimizeVectorRegisters.cpp [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-10 00:46:15 +00:00
R600Packetizer.cpp AMDGPU/R600: Initialize more passes 2017-08-02 22:19:45 +00:00
R600Processors.td AMDGPU/NFC: Split Processors.td into GCNProcessors.td and R600Processors.td 2017-11-10 20:01:58 +00:00
R600RegisterInfo.cpp AMDGPU: Start defining a calling convention 2017-05-17 21:56:25 +00:00
R600RegisterInfo.h AMDGPU: Start defining a calling convention 2017-05-17 21:56:25 +00:00
R600RegisterInfo.td AMDGPU: Move INDIRECT_BASE_ADDR definition out of common files 2017-07-29 03:44:07 +00:00
R600Schedule.td
R700Instructions.td
SIAnnotateControlFlow.cpp [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-08 00:47:13 +00:00
SIDebuggerInsertNops.cpp Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
SIDefines.h [AMDGPU][MC][GFX8][GFX9] Corrected names of integer v_{add/addc/sub/subrev/subb/subbrev} 2017-11-20 18:24:21 +00:00
SIFixSGPRCopies.cpp Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
SIFixVGPRCopies.cpp [AMDGPU] Add VGPR copies post regalloc fix pass 2017-01-24 17:46:17 +00:00
SIFixWWMLiveness.cpp Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
SIFoldOperands.cpp Remove unused variables 2017-10-15 05:35:02 +00:00
SIFrameLowering.cpp AMDGPU: Fix set but not used warnings related to AMDGPUAS 2017-11-01 19:12:38 +00:00
SIFrameLowering.h [AMDGPU] AMDPAL scratch buffer support 2017-09-29 09:49:35 +00:00
SIISelLowering.cpp [AMDGPU] Fix SITargetLowering::LowerCall for pointer info of byval argument 2017-11-22 16:13:35 +00:00
SIISelLowering.h AMDGPU: Don't use MUBUF vaddr if address may overflow 2017-11-15 00:45:43 +00:00
SIInsertSkips.cpp AMDGPU: Add new intrinsic llvm.amdgcn.kill(i1) 2017-10-24 10:27:13 +00:00
SIInsertWaitcnts.cpp AMDGPU: Move hazard avoidance out of waitcnt pass. 2017-11-17 21:35:32 +00:00
SIInsertWaits.cpp [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-08 00:47:13 +00:00
SIInstrFormats.td [AMDGPU][MC][GFX8][GFX9] Corrected names of integer v_{add/addc/sub/subrev/subb/subbrev} 2017-11-20 18:24:21 +00:00
SIInstrInfo.cpp AMDGPU: Consistently check for immediates in SIInstrInfo::FoldImmediate 2017-11-28 08:41:50 +00:00
SIInstrInfo.h AMDGPU: Replace list of SMEM buffer opcodes 2017-11-17 04:18:26 +00:00
SIInstrInfo.td AMDGPU: Select d16 loads into low component of register 2017-11-13 00:22:09 +00:00
SIInstructions.td [AMDGPU][MC][GFX8][GFX9] Corrected names of integer v_{add/addc/sub/subrev/subb/subbrev} 2017-11-20 18:24:21 +00:00
SIIntrinsics.td AMDGPU: Remove legacy export intrinsic 2017-04-04 16:34:39 +00:00
SILoadStoreOptimizer.cpp AMDGPU: Re-organize the outer loop of SILoadStoreOptimizer 2017-11-28 08:42:46 +00:00
SILowerControlFlow.cpp Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
SILowerI1Copies.cpp AMDGPU: VALU carry-in and v_cndmask condition cannot be EXEC 2017-09-29 15:37:31 +00:00
SIMachineFunctionInfo.cpp [AMDGPU] AMDPAL scratch buffer support 2017-09-29 09:49:35 +00:00
SIMachineFunctionInfo.h Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering 2017-11-08 01:01:31 +00:00
SIMachineScheduler.cpp Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
SIMachineScheduler.h AMDGPU/SI: Force exports at the end for SI scheduler 2017-07-25 20:36:58 +00:00
SIMemoryLegalizer.cpp AMDGPU: Handle non-temporal loads and stores 2017-09-07 17:14:54 +00:00
SIOptimizeExecMasking.cpp AMDGPU: Fix producing saveexec when the copy is spilled 2017-11-14 02:16:54 +00:00
SIOptimizeExecMaskingPreRA.cpp AMDGPU: Recompute scc liveness 2017-09-08 18:51:26 +00:00
SIPeepholeSDWA.cpp Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
SIRegisterInfo.cpp AMDGPU: Fix not converting d16 load/stores to offset 2017-11-13 23:24:26 +00:00
SIRegisterInfo.h [SystemZ] implement shouldCoalesce() 2017-09-29 14:31:39 +00:00
SIRegisterInfo.td AMDGPU: VALU carry-in and v_cndmask condition cannot be EXEC 2017-09-29 15:37:31 +00:00
SISchedule.td AMDGPU: Implement early ifcvt target hooks. 2017-01-25 04:25:02 +00:00
SIShrinkInstructions.cpp AMDGPU: Allow SIShrinkInstructions to fold FrameIndexes 2017-07-10 20:04:35 +00:00
SIWholeQuadMode.cpp Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
SMInstructions.td AMDGPU: Select s_buffer_load_dword with a non-constant SGPR offset 2017-10-31 21:06:42 +00:00
SOPInstructions.td AMDGPU: Add llvm.amdgcn.wqm.vote intrinsic 2017-10-24 10:26:59 +00:00
VIInstrFormats.td
VIInstructions.td
VOP1Instructions.td AMDGPU: Remove global isGCN predicates 2017-10-03 00:06:41 +00:00
VOP2Instructions.td [AMDGPU][MC][GFX8][GFX9] Corrected names of integer v_{add/addc/sub/subrev/subb/subbrev} 2017-11-20 18:24:21 +00:00
VOP3Instructions.td [AMDGPU][MC][GFX9] Added v_interp_p2_f16 and v_interp_p2_legacy_f16 2017-11-24 15:37:14 +00:00
VOP3PInstructions.td AMDGPU: Add max-mix-insts subtarget feature 2017-10-25 07:00:51 +00:00
VOPCInstructions.td AMDGPU: Remove global isGCN predicates 2017-10-03 00:06:41 +00:00
VOPInstructions.td [AMDGPU][MC][GFX9][disassembler] Corrected decoding of op_sel_hi for v_mad_mix* 2017-11-17 15:15:40 +00:00