llvm-project/llvm/lib/Target/AMDGPU
Tim Corringham 6c6d5e24cd AMDGPU: fix missing s_waitcnt
Summary:
The pass that inserts s_waitcnt instructions where needed propagated
info used to track dependencies for each block by iterating over the
predecessor blocks. The iteration was terminated when a predecessor
that had not yet been processed was encountered. Any info in blocks
later in the list was therefore not processed, leading to the
possiblility of a required s_waitcnt not being inserted.

The fix is simply to change the "break" to "continue" for the
relevant loops, so that all visited blocks are processed. This
is likely what was intended when the code was written.

There is no test case provided for this fix because:
1) the only example that reproduces this is large and resistant to
being reduced
2) the change is trivial

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D40544

llvm-svn: 319651
2017-12-04 12:30:49 +00:00
..
AsmParser [AMDGPU][MC][GFX9] Added support of 'inst_offset' modifier for compatibility with SP3 2017-11-24 13:22:38 +00:00
Disassembler [AMDGPU][MC][GFX8][GFX9] Corrected names of integer v_{add/addc/sub/subrev/subb/subbrev} 2017-11-20 18:24:21 +00:00
InstPrinter [AMDGPU][MC][DISASSEMBLER][GFX9] Corrected decoding of GLOBAL/SCRATCH opcodes 2017-11-27 17:14:35 +00:00
MCTargetDesc [AMDGPU] Emit metadata for hidden arguments for kernel enqueue 2017-10-30 14:30:28 +00:00
TargetInfo Add backend name to Target to enable runtime info to be fed back into TableGen 2017-11-15 23:55:44 +00:00
Utils AMDGPU: Fix set but not used warnings related to AMDGPUAS 2017-11-01 19:12:38 +00:00
AMDGPU.h [AMDGPU] Clean up symbols in the global namespace. 2017-10-31 23:21:30 +00:00
AMDGPU.td AMDGPU: Select DS insts without m0 initialization 2017-11-29 00:55:57 +00:00
AMDGPUAliasAnalysis.cpp [AMDGPU] calling conventions for AMDPAL OS type 2017-09-29 09:51:22 +00:00
AMDGPUAliasAnalysis.h [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-08 23:53:55 +00:00
AMDGPUAlwaysInlinePass.cpp AMDGPU: Add option to stress calls 2017-09-21 07:00:48 +00:00
AMDGPUAnnotateKernelFeatures.cpp [Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes. Also affected in files (NFC). 2017-08-31 21:56:16 +00:00
AMDGPUAnnotateUniformValues.cpp AMDGPU: Fix converting unanalyzable global loads to SMRD 2017-07-12 23:06:18 +00:00
AMDGPUArgumentUsageInfo.cpp [CodeGen] Rename functions PrintReg* to printReg* 2017-11-28 12:42:37 +00:00
AMDGPUArgumentUsageInfo.h [AMDGPU] Fixed MSVC build break 2017-08-04 10:53:07 +00:00
AMDGPUAsmPrinter.cpp AMDGPU: Add num spilled s/vgprs to metadata 2017-11-28 17:51:08 +00:00
AMDGPUAsmPrinter.h AMDGPU: Error on stack size overflow 2017-11-14 20:33:14 +00:00
AMDGPUCallLowering.cpp AMDGPU: Pass special input registers to functions 2017-08-03 23:00:29 +00:00
AMDGPUCallLowering.h AMDGPU: Start defining a calling convention 2017-05-17 21:56:25 +00:00
AMDGPUCallingConv.td AMDGPU: Initial implementation of calls 2017-08-01 19:54:18 +00:00
AMDGPUCodeGenPrepare.cpp [IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast-math-flag 2017-11-06 16:27:15 +00:00
AMDGPUFrameLowering.cpp [AMDGPU] Split R600/SI getFrameIndexReference and emit stack object offsets for SI 2017-03-10 19:39:07 +00:00
AMDGPUFrameLowering.h Move TargetFrameLowering.h to CodeGen where it's implemented 2017-11-03 22:32:11 +00:00
AMDGPUGenRegisterBankInfo.def [GlobalISel] Make GlobalISel a non-optional library. 2017-08-03 21:52:25 +00:00
AMDGPUISelDAGToDAG.cpp AMDGPU: Use gfx9 carry-less add/sub instructions 2017-11-30 22:51:26 +00:00
AMDGPUISelLowering.cpp DAG: Add nuw when splitting loads and stores 2017-11-29 01:25:12 +00:00
AMDGPUISelLowering.h [AMDGPU] Add custom lowering for llvm.log{,10}.{f16,f32} intrinsics 2017-11-27 13:26:38 +00:00
AMDGPUInline.cpp [AMDGPU] Port of HSAIL inliner 2017-09-20 04:25:58 +00:00
AMDGPUInstrInfo.cpp [AMDGPU][MC][GFX8][GFX9] Corrected names of integer v_{add/addc/sub/subrev/subb/subbrev} 2017-11-20 18:24:21 +00:00
AMDGPUInstrInfo.h Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering 2017-11-08 01:01:31 +00:00
AMDGPUInstrInfo.td Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ. 2017-10-12 19:37:14 +00:00
AMDGPUInstructionSelector.cpp [globalisel][tablegen] Generate rule coverage and use it to identify untested rules 2017-11-16 00:46:35 +00:00
AMDGPUInstructionSelector.h [globalisel][tablegen] Generate rule coverage and use it to identify untested rules 2017-11-16 00:46:35 +00:00
AMDGPUInstructions.td AMDGPU: Select d16 loads into low component of register 2017-11-13 00:22:09 +00:00
AMDGPUIntrinsicInfo.cpp Rename AttributeSet to AttributeList 2017-03-21 16:57:19 +00:00
AMDGPUIntrinsicInfo.h
AMDGPUIntrinsics.td AMDGPU: Remove legacy bfe intrinsics 2017-04-03 18:08:08 +00:00
AMDGPULegalizerInfo.cpp Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
AMDGPULegalizerInfo.h Re-commit AMDGPU/GlobalISel: Add support for simple shaders 2017-01-30 21:56:46 +00:00
AMDGPULibCalls.cpp Make helpers static. NFC. 2017-11-24 14:55:41 +00:00
AMDGPULibFunc.cpp [AMDGPU] Remove hardcoded address space value from AMDGPULibFunc 2017-11-04 17:37:43 +00:00
AMDGPULibFunc.h [AMDGPU] Remove hardcoded address space value from AMDGPULibFunc 2017-11-04 17:37:43 +00:00
AMDGPULowerIntrinsics.cpp Extend memcpy expansion in Transform/Utils to handle wider operand types. 2017-07-07 02:00:06 +00:00
AMDGPUMCInstLower.cpp Fix AMDGPU build issue 2017-10-11 23:53:36 +00:00
AMDGPUMCInstLower.h
AMDGPUMachineCFGStructurizer.cpp [CodeGen] Rename functions PrintReg* to printReg* 2017-11-28 12:42:37 +00:00
AMDGPUMachineFunction.cpp AMDGPU: Start defining a calling convention 2017-05-17 21:56:25 +00:00
AMDGPUMachineFunction.h Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
AMDGPUMachineModuleInfo.cpp AMDGPU: Implement memory model 2017-07-21 21:19:23 +00:00
AMDGPUMachineModuleInfo.h AMDGPU: Handle more than one memory operand in SIMemoryLegalizer 2017-09-07 16:14:21 +00:00
AMDGPUMacroFusion.cpp AMDGPU: Add macro fusion schedule DAG mutation 2017-07-06 20:57:05 +00:00
AMDGPUMacroFusion.h AMDGPU: Add macro fusion schedule DAG mutation 2017-07-06 20:57:05 +00:00
AMDGPUOpenCLEnqueuedBlockLowering.cpp AMDGPU: Fix set but not used warnings related to AMDGPUAS 2017-11-01 19:12:38 +00:00
AMDGPUOpenCLImageTypeLoweringPass.cpp [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-08 23:53:55 +00:00
AMDGPUPTNote.h AMDGPU/NFC: Move AMDGPU specific note types to ELF.h 2017-10-12 18:59:54 +00:00
AMDGPUPromoteAlloca.cpp AMDGPU: Fix assert on alloca of array of struct 2017-09-14 18:02:29 +00:00
AMDGPURegAsmNames.inc.cpp AMDGPU: Work around build special casing .inc files 2017-06-08 19:25:21 +00:00
AMDGPURegisterBankInfo.cpp Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
AMDGPURegisterBankInfo.h Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
AMDGPURegisterBanks.td Re-commit AMDGPU/GlobalISel: Add support for simple shaders 2017-01-30 21:56:46 +00:00
AMDGPURegisterInfo.cpp AMDGPU: Make frame register caller preserved 2017-09-14 17:14:57 +00:00
AMDGPURegisterInfo.h AMDGPU: Start defining a calling convention 2017-05-17 21:56:25 +00:00
AMDGPURegisterInfo.td AMDGPU: Move INDIRECT_BASE_ADDR definition out of common files 2017-07-29 03:44:07 +00:00
AMDGPURewriteOutArguments.cpp [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-08 23:53:55 +00:00
AMDGPUSubtarget.cpp AMDGPU: Don't use MUBUF vaddr if address may overflow 2017-11-15 00:45:43 +00:00
AMDGPUSubtarget.h AMDGPU: Allow negative MUBUF vaddr for gfx9 2017-11-30 00:52:40 +00:00
AMDGPUTargetMachine.cpp AMDGPU: Partial ILP scheduler port from SelectionDAG to SchedulingDAG (experimental) 2017-11-20 14:35:53 +00:00
AMDGPUTargetMachine.h AMDGPU: Enable IPRA 2017-11-28 23:40:12 +00:00
AMDGPUTargetObjectFile.cpp AMDGPU: Fix set but not used warnings related to AMDGPUAS 2017-11-01 19:12:38 +00:00
AMDGPUTargetObjectFile.h [AMDGPU] Get address space mapping by target triple environment 2017-03-27 14:04:01 +00:00
AMDGPUTargetTransformInfo.cpp [AMDGPU] calling conventions for AMDPAL OS type 2017-09-29 09:51:22 +00:00
AMDGPUTargetTransformInfo.h [AMDGPU] Port of HSAIL inliner 2017-09-20 04:25:58 +00:00
AMDGPUUnifyDivergentExitNodes.cpp [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). 2017-10-17 21:27:42 +00:00
AMDGPUUnifyMetadata.cpp [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-10 00:46:15 +00:00
AMDILCFGStructurizer.cpp [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-08 23:53:55 +00:00
AMDKernelCodeT.h [AMDGPU] Revert r310429 changes in AMDKernelCodeT.h which broke some build bots. 2017-08-09 00:06:29 +00:00
BUFInstructions.td AMDGPU: Select d16 loads into low component of register 2017-11-13 00:22:09 +00:00
CMakeLists.txt AMDGPU: Partial ILP scheduler port from SelectionDAG to SchedulingDAG (experimental) 2017-11-20 14:35:53 +00:00
CaymanInstructions.td [CodeGen] Print register names in lowercase in both MIR and debug output 2017-11-28 17:15:09 +00:00
DSInstructions.td AMDGPU: Select DS insts without m0 initialization 2017-11-29 00:55:57 +00:00
EvergreenInstructions.td [CodeGen] Print register names in lowercase in both MIR and debug output 2017-11-28 17:15:09 +00:00
FLATInstructions.td [AMDGPU][MC][DISASSEMBLER][GFX9] Corrected decoding of GLOBAL/SCRATCH opcodes 2017-11-27 17:14:35 +00:00
GCNHazardRecognizer.cpp AMDGPU: Move hazard avoidance out of waitcnt pass. 2017-11-17 21:35:32 +00:00
GCNHazardRecognizer.h AMDGPU: Move hazard avoidance out of waitcnt pass. 2017-11-17 21:35:32 +00:00
GCNILPSched.cpp AMDGPU: Partial ILP scheduler port from SelectionDAG to SchedulingDAG (experimental) 2017-11-20 14:35:53 +00:00
GCNIterativeScheduler.cpp AMDGPU: Partial ILP scheduler port from SelectionDAG to SchedulingDAG (experimental) 2017-11-20 14:35:53 +00:00
GCNIterativeScheduler.h AMDGPU: Partial ILP scheduler port from SelectionDAG to SchedulingDAG (experimental) 2017-11-20 14:35:53 +00:00
GCNMinRegStrategy.cpp [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-08 23:53:55 +00:00
GCNProcessors.td AMDGPU/NFC: Split Processors.td into GCNProcessors.td and R600Processors.td 2017-11-10 20:01:58 +00:00
GCNRegPressure.cpp [CodeGen] Rename functions PrintReg* to printReg* 2017-11-28 12:42:37 +00:00
GCNRegPressure.h [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-10 00:46:15 +00:00
GCNSchedStrategy.cpp [CodeGen] Rename DEBUG_TYPE to match passnames 2017-07-11 22:08:28 +00:00
GCNSchedStrategy.h fix typos in comments and error messges; NFC 2017-07-13 06:48:39 +00:00
LLVMBuild.txt AMDGPU: Add GlobalISel to required_libraries. 2017-01-28 18:13:08 +00:00
MIMGInstructions.td AMDGPU: Remove global isGCN predicates 2017-10-03 00:06:41 +00:00
Processors.td AMDGPU/NFC: Split Processors.td into GCNProcessors.td and R600Processors.td 2017-11-10 20:01:58 +00:00
R600ClauseMergePass.cpp AMDGPU/R600: Initialize more passes 2017-08-02 22:19:45 +00:00
R600ControlFlowFinalizer.cpp [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-10 00:46:15 +00:00
R600Defines.h
R600EmitClauseMarkers.cpp Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
R600ExpandSpecialInstrs.cpp [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-10 00:46:15 +00:00
R600FrameLowering.cpp Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
R600FrameLowering.h AMDGPU: Initial implementation of calls 2017-08-01 19:54:18 +00:00
R600ISelLowering.cpp [AMDGPU] Fix pointer info for lowering load/store for r600 for amdgiz environment 2017-11-10 02:03:28 +00:00
R600ISelLowering.h Add DAG argument to canMergeStoresTo NFC. 2017-07-10 20:25:54 +00:00
R600InstrFormats.td AMDGPU: Remove global isGCN predicates 2017-10-03 00:06:41 +00:00
R600InstrInfo.cpp Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
R600InstrInfo.h [AMDGPU] Fix pointer info for pseudo source for r600 2017-11-10 01:53:24 +00:00
R600Instructions.td AMDGPU: Remove global isGCN predicates 2017-10-03 00:06:41 +00:00
R600Intrinsics.td AMDGPU: Make intrinsics speculatable 2017-05-02 16:57:44 +00:00
R600MachineFunctionInfo.cpp
R600MachineFunctionInfo.h
R600MachineScheduler.cpp [CodeGen] Rename DEBUG_TYPE to match passnames 2017-07-11 22:08:28 +00:00
R600MachineScheduler.h
R600OptimizeVectorRegisters.cpp [CodeGen] Print "%vreg0" as "%0" in both MIR and debug output 2017-11-30 12:12:19 +00:00
R600Packetizer.cpp AMDGPU/R600: Initialize more passes 2017-08-02 22:19:45 +00:00
R600Processors.td AMDGPU/NFC: Split Processors.td into GCNProcessors.td and R600Processors.td 2017-11-10 20:01:58 +00:00
R600RegisterInfo.cpp AMDGPU: Start defining a calling convention 2017-05-17 21:56:25 +00:00
R600RegisterInfo.h AMDGPU: Start defining a calling convention 2017-05-17 21:56:25 +00:00
R600RegisterInfo.td AMDGPU: Move INDIRECT_BASE_ADDR definition out of common files 2017-07-29 03:44:07 +00:00
R600Schedule.td
R700Instructions.td
SIAnnotateControlFlow.cpp [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-08 00:47:13 +00:00
SIDebuggerInsertNops.cpp Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
SIDefines.h [AMDGPU][MC][GFX8][GFX9] Corrected names of integer v_{add/addc/sub/subrev/subb/subbrev} 2017-11-20 18:24:21 +00:00
SIFixSGPRCopies.cpp [AMDGPU] SiFixSGPRCopies should not modify non-divergent PHI 2017-12-01 11:56:34 +00:00
SIFixVGPRCopies.cpp [AMDGPU] Add VGPR copies post regalloc fix pass 2017-01-24 17:46:17 +00:00
SIFixWWMLiveness.cpp Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
SIFoldOperands.cpp [CodeGen] Print "%vreg0" as "%0" in both MIR and debug output 2017-11-30 12:12:19 +00:00
SIFrameLowering.cpp AMDGPU: Fix set but not used warnings related to AMDGPUAS 2017-11-01 19:12:38 +00:00
SIFrameLowering.h [AMDGPU] AMDPAL scratch buffer support 2017-09-29 09:49:35 +00:00
SIISelLowering.cpp AMDGPU: Use gfx9 carry-less add/sub instructions 2017-11-30 22:51:26 +00:00
SIISelLowering.h AMDGPU: Don't use MUBUF vaddr if address may overflow 2017-11-15 00:45:43 +00:00
SIInsertSkips.cpp AMDGPU: Add new intrinsic llvm.amdgcn.kill(i1) 2017-10-24 10:27:13 +00:00
SIInsertWaitcnts.cpp AMDGPU: fix missing s_waitcnt 2017-12-04 12:30:49 +00:00
SIInsertWaits.cpp [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-08-08 00:47:13 +00:00
SIInstrFormats.td [AMDGPU][MC][GFX8][GFX9] Corrected names of integer v_{add/addc/sub/subrev/subb/subbrev} 2017-11-20 18:24:21 +00:00
SIInstrInfo.cpp AMDGPU: Use carry-less adds in FI elimination 2017-11-30 23:42:30 +00:00
SIInstrInfo.h AMDGPU: Use gfx9 carry-less add/sub instructions 2017-11-30 22:51:26 +00:00
SIInstrInfo.td AMDGPU: Select DS insts without m0 initialization 2017-11-29 00:55:57 +00:00
SIInstructions.td AMDGPU: Use gfx9 carry-less add/sub instructions 2017-11-30 22:51:26 +00:00
SIIntrinsics.td AMDGPU: Remove legacy export intrinsic 2017-04-04 16:34:39 +00:00
SILoadStoreOptimizer.cpp AMDGPU: Use gfx9 carry-less add/sub instructions 2017-11-30 22:51:26 +00:00
SILowerControlFlow.cpp [CodeGen] Print register names in lowercase in both MIR and debug output 2017-11-28 17:15:09 +00:00
SILowerI1Copies.cpp AMDGPU: VALU carry-in and v_cndmask condition cannot be EXEC 2017-09-29 15:37:31 +00:00
SIMachineFunctionInfo.cpp [AMDGPU] AMDPAL scratch buffer support 2017-09-29 09:49:35 +00:00
SIMachineFunctionInfo.h Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering 2017-11-08 01:01:31 +00:00
SIMachineScheduler.cpp [CodeGen] Rename functions PrintReg* to printReg* 2017-11-28 12:42:37 +00:00
SIMachineScheduler.h AMDGPU/SI: Force exports at the end for SI scheduler 2017-07-25 20:36:58 +00:00
SIMemoryLegalizer.cpp AMDGPU: Handle non-temporal loads and stores 2017-09-07 17:14:54 +00:00
SIOptimizeExecMasking.cpp [CodeGen] Rename functions PrintReg* to printReg* 2017-11-28 12:42:37 +00:00
SIOptimizeExecMaskingPreRA.cpp AMDGPU: Recompute scc liveness 2017-09-08 18:51:26 +00:00
SIPeepholeSDWA.cpp [CodeGen] Print "%vreg0" as "%0" in both MIR and debug output 2017-11-30 12:12:19 +00:00
SIRegisterInfo.cpp AMDGPU: Use carry-less adds in FI elimination 2017-11-30 23:42:30 +00:00
SIRegisterInfo.h [SystemZ] implement shouldCoalesce() 2017-09-29 14:31:39 +00:00
SIRegisterInfo.td AMDGPU: VALU carry-in and v_cndmask condition cannot be EXEC 2017-09-29 15:37:31 +00:00
SISchedule.td AMDGPU: Implement early ifcvt target hooks. 2017-01-25 04:25:02 +00:00
SIShrinkInstructions.cpp AMDGPU: Allow SIShrinkInstructions to fold FrameIndexes 2017-07-10 20:04:35 +00:00
SIWholeQuadMode.cpp Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
SMInstructions.td AMDGPU: Select s_buffer_load_dword with a non-constant SGPR offset 2017-10-31 21:06:42 +00:00
SOPInstructions.td AMDGPU: Add llvm.amdgcn.wqm.vote intrinsic 2017-10-24 10:26:59 +00:00
VIInstrFormats.td
VIInstructions.td
VOP1Instructions.td AMDGPU: Remove global isGCN predicates 2017-10-03 00:06:41 +00:00
VOP2Instructions.td [AMDGPU][MC][GFX9] Corrected mapping of GFX9 v_add/sub/subrev_u32 2017-11-29 13:33:40 +00:00
VOP3Instructions.td [AMDGPU][MC][GFX9] Added v_interp_p2_f16 and v_interp_p2_legacy_f16 2017-11-24 15:37:14 +00:00
VOP3PInstructions.td AMDGPU: Add max-mix-insts subtarget feature 2017-10-25 07:00:51 +00:00
VOPCInstructions.td AMDGPU: Remove global isGCN predicates 2017-10-03 00:06:41 +00:00
VOPInstructions.td [AMDGPU][MC][GFX9][disassembler] Corrected decoding of op_sel_hi for v_mad_mix* 2017-11-17 15:15:40 +00:00