llvm-project/llvm/lib/Target/AMDGPU
Connor Abbott 92638ab625 [AMDGPU] Add support for Whole Wavefront Mode
Summary:
Whole Wavefront Mode (WWM) is similar to WQM, except that all of the
lanes are always enabled, regardless of control flow. This is required
for implementing wavefront reductions in non-uniform control flow, where
we need to use the inactive lanes to propagate intermediate results, so
they need to be enabled. We need to propagate WWM to uses (unless
they're explicitly marked as exact) so that they also propagate
intermediate results correctly. We do the analysis and exec mask munging
during the WQM pass, since there are interactions with WQM for things
that require both WQM and WWM. For simplicity, WWM is entirely
block-local: no block is in WWM on entry or exit, and WWM is not
propagated to the block level. This means that computations involving
WWM cannot involve control flow, but we only ever plan to use WWM for a
few limited purposes (none of which involve control flow) anyway.
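
To illustrate why a reduction needs the inactive lanes enabled, here is a
minimal host-side sketch. It is not code from this change: the 8-lane wave
size, the butterfly exchange pattern, and the names kLanes, Wave,
butterflySum and reduceActive are all assumptions made for the example. It
models the exec mask as a bitmask and sums the values of the active lanes.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical wave size, kept small for readability.
constexpr std::size_t kLanes = 8;
using Wave = std::array<int, kLanes>;

// Butterfly sum. In each step, a lane adds the value held by its partner
// lane (lane ^ stride). Only lanes whose bit is set in `exec` execute the
// step; all other lanes keep their old value.
Wave butterflySum(Wave v, std::uint32_t exec) {
  for (std::size_t stride = 1; stride < kLanes; stride <<= 1) {
    Wave next = v;
    for (std::size_t lane = 0; lane < kLanes; ++lane)
      if (exec & (1u << lane))
        next[lane] = v[lane] + v[lane ^ stride];
    v = next;
  }
  return v;
}

// Sums the values of the lanes set in `active`. With `wwm` set, every lane
// runs the butterfly steps; otherwise only the active lanes do.
int reduceActive(const Wave &src, std::uint32_t active, bool wwm) {
  Wave v = src;
  // Assumption: inactive lanes are first given the identity value (0).
  for (std::size_t lane = 0; lane < kLanes; ++lane)
    if (!(active & (1u << lane)))
      v[lane] = 0;

  const std::uint32_t allLanes = (1u << kLanes) - 1;
  const Wave r = butterflySum(v, wwm ? allLanes : active);

  // Read the result back from the first active lane.
  for (std::size_t lane = 0; lane < kLanes; ++lane)
    if (active & (1u << lane))
      return r[lane];
  return 0; // no active lanes
}
```

With lane values 1 through 8 and lanes 0, 2 and 5 active (mask 0x25),
reduceActive(..., true) returns 10, the sum of 1, 3 and 6. The same call
with wwm set to false returns 4, because the partial sums that should have
passed through inactive lanes are never computed.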

Shaders can ask for WWM using the @llvm.amdgcn.wwm intrinsic. There
isn't yet a way to turn WWM off -- that will be added in a future
change.

Finally, it turns out that turning on inactive lanes causes a number of
problems with register allocation. While the best long-term solution
seems like teaching LLVM's register allocator about predication, for now
we need to add some hacks to prevent ourselves from getting into trouble
due to constraints that aren't currently expressed in LLVM. For the gory
details, see the comments at the top of SIFixWWMLiveness.cpp.

Reviewers: arsenm, nhaehnle, tpr

Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D35524

llvm-svn: 310087
2017-08-04 18:36:52 +00:00
..
AsmParser [AMDGPU][MC] Enabled expressions as operands 2017-08-04 13:55:24 +00:00
Disassembler AMDGPU: Add instruction definitions for some scratch_* instructions 2017-07-21 15:36:16 +00:00
InstPrinter AMDGPU: Remove deadcode from AMDGPUInstPrinter 2017-07-29 03:56:53 +00:00
MCTargetDesc AMDGPU: Fix emitting encoded calls 2017-08-02 01:42:04 +00:00
TargetInfo fix trivial typos; NFC 2017-07-02 03:24:54 +00:00
Utils AMDGPU: Fix using SMRD instructions for argument loads in functions 2017-07-26 20:39:42 +00:00
AMDGPU.h [AMDGPU] Add support for Whole Wavefront Mode 2017-08-04 18:36:52 +00:00
AMDGPU.td AMDGPU: Add instruction definitions for some scratch_* instructions 2017-07-21 15:36:16 +00:00
AMDGPUAliasAnalysis.cpp Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
AMDGPUAliasAnalysis.h AMDGPU/R600: Fix amdgpu alias analysis pass. 2017-03-31 19:26:23 +00:00
AMDGPUAlwaysInlinePass.cpp [AMDGPU] Testing commit access only, no real change 2017-06-15 23:02:55 +00:00
AMDGPUAnnotateKernelFeatures.cpp AMDGPU: Annotate implicitarg.ptr usage 2017-07-28 15:52:08 +00:00
AMDGPUAnnotateUniformValues.cpp AMDGPU: Fix converting unanalyzable global loads to SMRD 2017-07-12 23:06:18 +00:00
AMDGPUArgumentUsageInfo.cpp AMDGPU: Fix implicitarg.ptr handling special inputs 2017-08-03 23:12:44 +00:00
AMDGPUArgumentUsageInfo.h [AMDGPU] Fixed MSVC build break 2017-08-04 10:53:07 +00:00
AMDGPUAsmPrinter.cpp AMDGPU: Restore using MRI to find highest used regs 2017-08-02 17:15:01 +00:00
AMDGPUAsmPrinter.h Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
AMDGPUCallLowering.cpp AMDGPU: Pass special input registers to functions 2017-08-03 23:00:29 +00:00
AMDGPUCallLowering.h AMDGPU: Start defining a calling convention 2017-05-17 21:56:25 +00:00
AMDGPUCallingConv.td AMDGPU: Initial implementation of calls 2017-08-01 19:54:18 +00:00
AMDGPUCodeGenPrepare.cpp AMDGPU : Widen extending scalar loads to 32-bits. 2017-07-26 21:07:28 +00:00
AMDGPUFrameLowering.cpp [AMDGPU] Split R600/SI getFrameIndexReference and emit stack object offsets for SI 2017-03-10 19:39:07 +00:00
AMDGPUFrameLowering.h AMDGPU: Initial implementation of calls 2017-08-01 19:54:18 +00:00
AMDGPUGenRegisterBankInfo.def [GlobalISel] Make GlobalISel a non-optional library. 2017-08-03 21:52:25 +00:00
AMDGPUISelDAGToDAG.cpp AMDGPU: Add analysis pass for function argument info 2017-08-03 22:30:46 +00:00
AMDGPUISelLowering.cpp AMDGPU: Don't use report_fatal_error for unsupported call types 2017-08-03 23:32:41 +00:00
AMDGPUISelLowering.h AMDGPU: Don't use report_fatal_error for unsupported call types 2017-08-03 23:32:41 +00:00
AMDGPUInstrInfo.cpp AMDGPU: Initial implementation of calls 2017-08-01 19:54:18 +00:00
AMDGPUInstrInfo.h Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
AMDGPUInstrInfo.td AMDGPU: Initial implementation of calls 2017-08-01 19:54:18 +00:00
AMDGPUInstructionSelector.cpp AMDGPU: Start adding offset fields to flat instructions 2017-06-12 15:55:58 +00:00
AMDGPUInstructionSelector.h Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
AMDGPUInstructions.td AMDGPU: Start selecting global instructions 2017-07-29 01:03:53 +00:00
AMDGPUIntrinsicInfo.cpp Rename AttributeSet to AttributeList 2017-03-21 16:57:19 +00:00
AMDGPUIntrinsicInfo.h
AMDGPUIntrinsics.td AMDGPU: Remove legacy bfe intrinsics 2017-04-03 18:08:08 +00:00
AMDGPULegalizerInfo.cpp [GlobalISel] Make GlobalISel a non-optional library. 2017-08-03 21:52:25 +00:00
AMDGPULegalizerInfo.h Re-commit AMDGPU/GlobalISel: Add support for simple shaders 2017-01-30 21:56:46 +00:00
AMDGPULowerIntrinsics.cpp Extend memcpy expansion in Transform/Utils to handle wider operand types. 2017-07-07 02:00:06 +00:00
AMDGPUMCInstLower.cpp AMDGPU: Fix emitting encoded calls 2017-08-02 01:42:04 +00:00
AMDGPUMCInstLower.h Reapply "AMDGPU: Support using tablegened MC pseudo expansions" 2016-10-06 17:19:11 +00:00
AMDGPUMachineCFGStructurizer.cpp Guard print() functions only used by dump() functions. 2017-07-31 10:07:49 +00:00
AMDGPUMachineFunction.cpp AMDGPU: Start defining a calling convention 2017-05-17 21:56:25 +00:00
AMDGPUMachineFunction.h Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
AMDGPUMachineModuleInfo.cpp AMDGPU: Implement memory model 2017-07-21 21:19:23 +00:00
AMDGPUMachineModuleInfo.h AMDGPU: Implement memory model 2017-07-21 21:19:23 +00:00
AMDGPUMacroFusion.cpp AMDGPU: Add macro fusion schedule DAG mutation 2017-07-06 20:57:05 +00:00
AMDGPUMacroFusion.h AMDGPU: Add macro fusion schedule DAG mutation 2017-07-06 20:57:05 +00:00
AMDGPUOpenCLImageTypeLoweringPass.cpp Use StringRef in Pass/PassManager APIs (NFC) 2016-10-01 02:56:57 +00:00
AMDGPUPTNote.h [AMDGPU] Restructure code object metadata creation 2017-03-22 22:32:22 +00:00
AMDGPUPromoteAlloca.cpp [AMDGPU] Fix for issue in alloca to vector promotion pass 2017-06-09 14:16:22 +00:00
AMDGPURegAsmNames.inc.cpp AMDGPU: Work around build special casing .inc files 2017-06-08 19:25:21 +00:00
AMDGPURegisterBankInfo.cpp [GlobalISel] Make GlobalISel a non-optional library. 2017-08-03 21:52:25 +00:00
AMDGPURegisterBankInfo.h Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
AMDGPURegisterBanks.td Re-commit AMDGPU/GlobalISel: Add support for simple shaders 2017-01-30 21:56:46 +00:00
AMDGPURegisterInfo.cpp AMDGPU: Initial implementation of calls 2017-08-01 19:54:18 +00:00
AMDGPURegisterInfo.h AMDGPU: Start defining a calling convention 2017-05-17 21:56:25 +00:00
AMDGPURegisterInfo.td AMDGPU: Move INDIRECT_BASE_ADDR definition out of common files 2017-07-29 03:44:07 +00:00
AMDGPURewriteOutArguments.cpp [AMDGPU] Put a function used only inside assert() under NDEBUG. 2017-08-01 19:07:20 +00:00
AMDGPUSubtarget.cpp [GlobalISel] Make GlobalISel a non-optional library. 2017-08-03 21:52:25 +00:00
AMDGPUSubtarget.h AMDGPU: Annotate implicitarg.ptr usage 2017-07-28 15:52:08 +00:00
AMDGPUTargetMachine.cpp [AMDGPU] Add support for Whole Wavefront Mode 2017-08-04 18:36:52 +00:00
AMDGPUTargetMachine.h AMDGPU: Remove error on calls for amdgcn 2017-08-03 23:24:05 +00:00
AMDGPUTargetObjectFile.cpp Move Object format code to lib/BinaryFormat. 2017-06-07 03:48:56 +00:00
AMDGPUTargetObjectFile.h [AMDGPU] Get address space mapping by target triple environment 2017-03-27 14:04:01 +00:00
AMDGPUTargetTransformInfo.cpp [LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI. 2017-06-28 15:53:17 +00:00
AMDGPUTargetTransformInfo.h [LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI. 2017-06-28 15:53:17 +00:00
AMDGPUUnifyDivergentExitNodes.cpp AMDGPU: Unify divergent function exits. 2017-03-24 19:52:05 +00:00
AMDGPUUnifyMetadata.cpp [AMDGPU] Turn AMDGPUUnifyMetadata back into module pass 2017-01-27 16:38:10 +00:00
AMDILCFGStructurizer.cpp Guard print() functions only used by dump() functions. 2017-07-31 10:07:49 +00:00
AMDKernelCodeT.h
BUFInstructions.td AMDGPU: Implement memory model 2017-07-21 21:19:23 +00:00
CMakeLists.txt [AMDGPU] Add support for Whole Wavefront Mode 2017-08-04 18:36:52 +00:00
CaymanInstructions.td
DSInstructions.td [AMDGPU][MC] New syntax for ds_swizzle_b32 offset 2017-05-31 16:26:47 +00:00
EvergreenInstructions.td AMDGPU: Fix unnecessary ands when packing f16 vectors 2017-03-15 19:04:26 +00:00
FLATInstructions.td AMDGPU: Start selecting global instructions 2017-07-29 01:03:53 +00:00
GCNHazardRecognizer.cpp [AMDGPU] Add missing hazard for DPP-after-EXEC-write 2017-08-04 01:09:43 +00:00
GCNHazardRecognizer.h AMDGPU: Fix broken condition in hazard recognizer 2017-03-17 21:36:28 +00:00
GCNIterativeScheduler.cpp [CodeGen] Rename DEBUG_TYPE to match passnames 2017-07-11 22:08:28 +00:00
GCNIterativeScheduler.h [AMDGPU] Iterative scheduling infrastructure + minimal registry scheduler 2017-03-21 13:15:46 +00:00
GCNMinRegStrategy.cpp [CodeGen] Rename DEBUG_TYPE to match passnames 2017-07-11 22:08:28 +00:00
GCNRegPressure.cpp Implement LaneBitmask::getNumLanes and LaneBitmask::getHighestLane 2017-07-20 19:43:19 +00:00
GCNRegPressure.h [AMDGPU] Fix incorrect register usage tracking in GCNUpwardTracker 2017-05-22 13:09:40 +00:00
GCNSchedStrategy.cpp [CodeGen] Rename DEBUG_TYPE to match passnames 2017-07-11 22:08:28 +00:00
GCNSchedStrategy.h fix typos in comments and error messges; NFC 2017-07-13 06:48:39 +00:00
LLVMBuild.txt AMDGPU: Add GlobalISel to required_libraries. 2017-01-28 18:13:08 +00:00
MIMGInstructions.td [AMDGPU] Fix latency of MIMG instructions 2017-07-04 14:43:38 +00:00
Processors.td AMDGPU: Whitespace fixes 2017-06-26 03:01:36 +00:00
R600ClauseMergePass.cpp AMDGPU/R600: Initialize more passes 2017-08-02 22:19:45 +00:00
R600ControlFlowFinalizer.cpp AMDGPU/R600: Initialize more passes 2017-08-02 22:19:45 +00:00
R600Defines.h
R600EmitClauseMarkers.cpp Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
R600ExpandSpecialInstrs.cpp AMDGPU/R600: Initialize more passes 2017-08-02 22:19:45 +00:00
R600FrameLowering.cpp Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
R600FrameLowering.h AMDGPU: Initial implementation of calls 2017-08-01 19:54:18 +00:00
R600ISelLowering.cpp Add DAG argument to canMergeStoresTo NFC. 2017-07-10 20:25:54 +00:00
R600ISelLowering.h Add DAG argument to canMergeStoresTo NFC. 2017-07-10 20:25:54 +00:00
R600InstrFormats.td
R600InstrInfo.cpp Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
R600InstrInfo.h Cyle -> Cycle; NFCI 2017-03-15 15:37:42 +00:00
R600Instructions.td AMDGPU: Remove deadcode from AMDGPUInstPrinter 2017-07-29 03:56:53 +00:00
R600Intrinsics.td AMDGPU: Make intrinsics speculatable 2017-05-02 16:57:44 +00:00
R600MachineFunctionInfo.cpp
R600MachineFunctionInfo.h
R600MachineScheduler.cpp [CodeGen] Rename DEBUG_TYPE to match passnames 2017-07-11 22:08:28 +00:00
R600MachineScheduler.h [AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). 2016-12-09 22:06:55 +00:00
R600OptimizeVectorRegisters.cpp AMDGPU/R600: Initialize more passes 2017-08-02 22:19:45 +00:00
R600Packetizer.cpp AMDGPU/R600: Initialize more passes 2017-08-02 22:19:45 +00:00
R600RegisterInfo.cpp AMDGPU: Start defining a calling convention 2017-05-17 21:56:25 +00:00
R600RegisterInfo.h AMDGPU: Start defining a calling convention 2017-05-17 21:56:25 +00:00
R600RegisterInfo.td AMDGPU: Move INDIRECT_BASE_ADDR definition out of common files 2017-07-29 03:44:07 +00:00
R600Schedule.td
R700Instructions.td
SIAnnotateControlFlow.cpp Remove now useless trailing nullptr in StructType::get 2017-05-11 08:46:02 +00:00
SIDebuggerInsertNops.cpp Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
SIDefines.h AMDGPU: Introduce maybeAtomic instruction flag 2017-07-21 21:05:45 +00:00
SIFixControlFlowLiveIntervals.cpp Use StringRef in Pass/PassManager APIs (NFC) 2016-10-01 02:56:57 +00:00
SIFixSGPRCopies.cpp [AMDGPU] Add support for Whole Wavefront Mode 2017-08-04 18:36:52 +00:00
SIFixVGPRCopies.cpp [AMDGPU] Add VGPR copies post regalloc fix pass 2017-01-24 17:46:17 +00:00
SIFixWWMLiveness.cpp [AMDGPU] Add support for Whole Wavefront Mode 2017-08-04 18:36:52 +00:00
SIFoldOperands.cpp AMDGPU: Fix crash when folding immediates into multiple uses 2017-07-18 14:54:41 +00:00
SIFrameLowering.cpp AMDGPU: Pass special input registers to functions 2017-08-03 23:00:29 +00:00
SIFrameLowering.h AMDGPU: Initial implementation of calls 2017-08-01 19:54:18 +00:00
SIISelLowering.cpp [AMDGPU] Add support for Whole Wavefront Mode 2017-08-04 18:36:52 +00:00
SIISelLowering.h AMDGPU: Pass special input registers to functions 2017-08-03 23:00:29 +00:00
SIInsertSkips.cpp AMDGPU: Rename SI_RETURN 2017-03-21 22:18:10 +00:00
SIInsertWaitcnts.cpp AMDGPU: Partially fix improper reliance on memoperands 2017-07-21 18:54:54 +00:00
SIInsertWaits.cpp AMDGPU: Make auto waitcnt before barrier a feature 2017-06-02 17:40:26 +00:00
SIInstrFormats.td AMDGPU: Introduce maybeAtomic instruction flag 2017-07-21 21:05:45 +00:00
SIInstrInfo.cpp [AMDGPU] Add support for Whole Wavefront Mode 2017-08-04 18:36:52 +00:00
SIInstrInfo.h AMDGPU: Make areMemAccessesTriviallyDisjoint more aware of segment flat 2017-07-29 01:26:21 +00:00
SIInstrInfo.td AMDGPU: Remove leftover td file 2017-07-22 00:40:46 +00:00
SIInstructions.td [AMDGPU] Add support for Whole Wavefront Mode 2017-08-04 18:36:52 +00:00
SIIntrinsics.td AMDGPU: Remove legacy export intrinsic 2017-04-04 16:34:39 +00:00
SILoadStoreOptimizer.cpp [LegacyPassManager] Remove TargetMachine constructors 2017-05-18 17:21:13 +00:00
SILowerControlFlow.cpp [AMDGPU] Preserve inverted bit in SI_IF in presence of SI_KILL 2017-08-04 06:58:42 +00:00
SILowerI1Copies.cpp Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
SIMachineFunctionInfo.cpp AMDGPU: Fix implicitarg.ptr handling special inputs 2017-08-03 23:12:44 +00:00
SIMachineFunctionInfo.h AMDGPU: Pass special input registers to functions 2017-08-03 23:00:29 +00:00
SIMachineScheduler.cpp AMDGPU/SI: Fix Depth and Height computation for SI scheduler 2017-07-25 20:37:03 +00:00
SIMachineScheduler.h AMDGPU/SI: Force exports at the end for SI scheduler 2017-07-25 20:36:58 +00:00
SIMemoryLegalizer.cpp AMDGPU: Implement memory model 2017-07-21 21:19:23 +00:00
SIOptimizeExecMasking.cpp [AMDGPU] Turn s_and_saveexec_b64 into s_and_b64 if result is unused 2017-08-01 23:44:35 +00:00
SIOptimizeExecMaskingPreRA.cpp [AMDGPU] Fix asan error after last commit 2017-08-02 01:18:57 +00:00
SIPeepholeSDWA.cpp [AMDGPU] SDWA: several fixes for V_CVT and VOPC instructions 2017-06-27 15:02:23 +00:00
SIRegisterInfo.cpp AMDGPU: Pass special input registers to functions 2017-08-03 23:00:29 +00:00
SIRegisterInfo.h AMDGPU: Pass special input registers to functions 2017-08-03 23:00:29 +00:00
SIRegisterInfo.td AMDGPU: Initial implementation of calls 2017-08-01 19:54:18 +00:00
SISchedule.td AMDGPU: Implement early ifcvt target hooks. 2017-01-25 04:25:02 +00:00
SIShrinkInstructions.cpp AMDGPU: Allow SIShrinkInstructions to fold FrameIndexes 2017-07-10 20:04:35 +00:00
SIWholeQuadMode.cpp [AMDGPU] Add support for Whole Wavefront Mode 2017-08-04 18:36:52 +00:00
SMInstructions.td AMDGPUAnnotateUniformValue should always treat volatile loads as divergent 2017-06-02 15:25:52 +00:00
SOPInstructions.td Resubmit r303859 with test fixed. 2017-05-26 20:38:26 +00:00
VIInstrFormats.td [AMDGPU] Refactor VOP1 and VOP2 instruction TD definitions 2016-09-23 09:08:07 +00:00
VIInstructions.td AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
VOP1Instructions.td [AMDGPU] SDWA: merge VI and GFX9 pseudo instructions 2017-06-21 08:53:38 +00:00
VOP2Instructions.td AMDGPU: Add encoding for carryless add/sub instructions 2017-07-20 17:42:47 +00:00
VOP3Instructions.td [AMDGPU][MC][GFX9] Added support of VOP3 'op_sel' modifier 2017-07-21 13:54:11 +00:00
VOP3PInstructions.td [AMDGPU][MC] Added missing VOP3P opcodes 2017-07-18 09:24:10 +00:00
VOPCInstructions.td [AMDGPU] resubmit r308179: CodeGen: check dst operand type to determine if omod is supported for VOP3 instructions 2017-07-18 14:23:26 +00:00
VOPInstructions.td [AMDGPU][MC][GFX9] Added support of VOP3 'op_sel' modifier 2017-07-21 13:54:11 +00:00