llvm-project

Connor Abbott 92638ab625 [AMDGPU] Add support for Whole Wavefront Mode	2017-08-04 18:36:52 +00:00

Summary:
Whole Wavefront Mode (WWM) is similar to WQM, except that all of the lanes are always enabled, regardless of control flow. This is required for implementing wavefront reductions in non-uniform control flow, where we need to use the inactive lanes to propagate intermediate results, so they need to be enabled. We need to propagate WWM to uses (unless they're explicitly marked as exact) so that they also propagate intermediate results correctly. We do the analysis and exec mask munging during the WQM pass, since there are interactions with WQM for things that require both WQM and WWM. For simplicity, WWM is entirely block-local -- blocks are never WWM on entry or exit of a block, and WWM is not propagated to the block level. This means that computations involving WWM cannot involve control flow, but we only ever plan to use WWM for a few limited purposes (none of which involve control flow) anyways.

Shaders can ask for WWM using the @llvm.amdgcn.wwm intrinsic. There isn't yet a way to turn WWM off -- that will be added in a future change.

Finally, it turns out that turning on inactive lanes causes a number of problems with register allocation. While the best long-term solution seems like teaching LLVM's register allocator about predication, for now we need to add some hacks to prevent ourselves from getting into trouble due to constraints that aren't currently expressed in LLVM. For the gory details, see the comments at the top of SIFixWWMLiveness.cpp.

Reviewers: arsenm, nhaehnle, tpr

Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D35524

llvm-svn: 310087
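
The commit message above says that a wavefront reduction in non-uniform control flow needs the inactive lanes enabled so they can carry intermediate results. The following is a toy C++ model of that idea, not code from this commit or from LLVM. The 8-lane width, the `Wave` type, the `reduceSum` function and the convention that inactive lanes hold the identity value 0 are all assumptions made for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical toy wavefront: 8 lanes, one int register per lane.
constexpr std::size_t kLanes = 8;
using Wave = std::array<int, kLanes>;

// Tree reduction (sum) across the lanes. In each step, lane L adds the value
// held by lane L + stride into its own register, but only if bit L of `exec`
// is set. A lane whose exec bit is clear does not execute, so its register
// keeps its old value and the partial sum it should have carried is lost.
int reduceSum(Wave regs, std::uint32_t exec) {
  for (std::size_t stride = 1; stride < kLanes; stride *= 2) {
    Wave next = regs;
    for (std::size_t lane = 0; lane + stride < kLanes; lane += 2 * stride) {
      if (exec & (1u << lane)) {
        next[lane] = regs[lane] + regs[lane + stride];
      }
    }
    regs = next;
  }
  return regs[0]; // lane 0 holds the final result
}
```

As a usage sketch, take lanes 0, 3, 5 and 6 as the active lanes (exec mask `0x69`) with values 1, 4, 6 and 7, and zero in every inactive lane: `Wave values = {1, 0, 0, 4, 0, 6, 7, 0};`. Calling `reduceSum(values, 0x69u)` returns 1, because inactive lanes 2 and 4 never accumulate the partial sums that lane 0 later reads. Calling `reduceSum(values, 0xFFu)`, which models all lanes being enabled as in WWM, returns the expected 18.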
..
AsmParser	[AMDGPU][MC] Enabled expressions as operands	2017-08-04 13:55:24 +00:00
Disassembler	AMDGPU: Add instruction definitions for some scratch_* instructions	2017-07-21 15:36:16 +00:00
InstPrinter	AMDGPU: Remove deadcode from AMDGPUInstPrinter	2017-07-29 03:56:53 +00:00
MCTargetDesc	AMDGPU: Fix emitting encoded calls	2017-08-02 01:42:04 +00:00
TargetInfo	fix trivial typos; NFC	2017-07-02 03:24:54 +00:00
Utils	AMDGPU: Fix using SMRD instructions for argument loads in functions	2017-07-26 20:39:42 +00:00
AMDGPU.h	[AMDGPU] Add support for Whole Wavefront Mode	2017-08-04 18:36:52 +00:00
AMDGPU.td	AMDGPU: Add instruction definitions for some scratch_* instructions	2017-07-21 15:36:16 +00:00
AMDGPUAliasAnalysis.cpp	Sort the remaining #include lines in include/... and lib/....	2017-06-06 11:49:48 +00:00
AMDGPUAliasAnalysis.h	AMDGPU/R600: Fix amdgpu alias analysis pass.	2017-03-31 19:26:23 +00:00
AMDGPUAlwaysInlinePass.cpp	[AMDGPU] Testing commit access only, no real change	2017-06-15 23:02:55 +00:00
AMDGPUAnnotateKernelFeatures.cpp	AMDGPU: Annotate implicitarg.ptr usage	2017-07-28 15:52:08 +00:00
AMDGPUAnnotateUniformValues.cpp	AMDGPU: Fix converting unanalyzable global loads to SMRD	2017-07-12 23:06:18 +00:00
AMDGPUArgumentUsageInfo.cpp	AMDGPU: Fix implicitarg.ptr handling special inputs	2017-08-03 23:12:44 +00:00
AMDGPUArgumentUsageInfo.h	[AMDGPU] Fixed MSVC build break	2017-08-04 10:53:07 +00:00
AMDGPUAsmPrinter.cpp	AMDGPU: Restore using MRI to find highest used regs	2017-08-02 17:15:01 +00:00
AMDGPUAsmPrinter.h	Sort the remaining #include lines in include/... and lib/....	2017-06-06 11:49:48 +00:00
AMDGPUCallLowering.cpp	AMDGPU: Pass special input registers to functions	2017-08-03 23:00:29 +00:00
AMDGPUCallLowering.h	AMDGPU: Start defining a calling convention	2017-05-17 21:56:25 +00:00
AMDGPUCallingConv.td	AMDGPU: Initial implementation of calls	2017-08-01 19:54:18 +00:00
AMDGPUCodeGenPrepare.cpp	AMDGPU : Widen extending scalar loads to 32-bits.	2017-07-26 21:07:28 +00:00
AMDGPUFrameLowering.cpp	[AMDGPU] Split R600/SI getFrameIndexReference and emit stack object offsets for SI	2017-03-10 19:39:07 +00:00
AMDGPUFrameLowering.h	AMDGPU: Initial implementation of calls	2017-08-01 19:54:18 +00:00
AMDGPUGenRegisterBankInfo.def	[GlobalISel] Make GlobalISel a non-optional library.	2017-08-03 21:52:25 +00:00
AMDGPUISelDAGToDAG.cpp	AMDGPU: Add analysis pass for function argument info	2017-08-03 22:30:46 +00:00
AMDGPUISelLowering.cpp	AMDGPU: Don't use report_fatal_error for unsupported call types	2017-08-03 23:32:41 +00:00
AMDGPUISelLowering.h	AMDGPU: Don't use report_fatal_error for unsupported call types	2017-08-03 23:32:41 +00:00
AMDGPUInstrInfo.cpp	AMDGPU: Initial implementation of calls	2017-08-01 19:54:18 +00:00
AMDGPUInstrInfo.h	Sort the remaining #include lines in include/... and lib/....	2017-06-06 11:49:48 +00:00
AMDGPUInstrInfo.td	AMDGPU: Initial implementation of calls	2017-08-01 19:54:18 +00:00
AMDGPUInstructionSelector.cpp	AMDGPU: Start adding offset fields to flat instructions	2017-06-12 15:55:58 +00:00
AMDGPUInstructionSelector.h	Sort the remaining #include lines in include/... and lib/....	2017-06-06 11:49:48 +00:00
AMDGPUInstructions.td	AMDGPU: Start selecting global instructions	2017-07-29 01:03:53 +00:00
AMDGPUIntrinsicInfo.cpp	Rename AttributeSet to AttributeList	2017-03-21 16:57:19 +00:00
AMDGPUIntrinsicInfo.h	…
AMDGPUIntrinsics.td	AMDGPU: Remove legacy bfe intrinsics	2017-04-03 18:08:08 +00:00
AMDGPULegalizerInfo.cpp	[GlobalISel] Make GlobalISel a non-optional library.	2017-08-03 21:52:25 +00:00
AMDGPULegalizerInfo.h	Re-commit AMDGPU/GlobalISel: Add support for simple shaders	2017-01-30 21:56:46 +00:00
AMDGPULowerIntrinsics.cpp	Extend memcpy expansion in Transform/Utils to handle wider operand types.	2017-07-07 02:00:06 +00:00
AMDGPUMCInstLower.cpp	AMDGPU: Fix emitting encoded calls	2017-08-02 01:42:04 +00:00
AMDGPUMCInstLower.h	Reapply "AMDGPU: Support using tablegened MC pseudo expansions"	2016-10-06 17:19:11 +00:00
AMDGPUMachineCFGStructurizer.cpp	Guard print() functions only used by dump() functions.	2017-07-31 10:07:49 +00:00
AMDGPUMachineFunction.cpp	AMDGPU: Start defining a calling convention	2017-05-17 21:56:25 +00:00
AMDGPUMachineFunction.h	Sort the remaining #include lines in include/... and lib/....	2017-06-06 11:49:48 +00:00
AMDGPUMachineModuleInfo.cpp	AMDGPU: Implement memory model	2017-07-21 21:19:23 +00:00
AMDGPUMachineModuleInfo.h	AMDGPU: Implement memory model	2017-07-21 21:19:23 +00:00
AMDGPUMacroFusion.cpp	AMDGPU: Add macro fusion schedule DAG mutation	2017-07-06 20:57:05 +00:00
AMDGPUMacroFusion.h	AMDGPU: Add macro fusion schedule DAG mutation	2017-07-06 20:57:05 +00:00
AMDGPUOpenCLImageTypeLoweringPass.cpp	Use StringRef in Pass/PassManager APIs (NFC)	2016-10-01 02:56:57 +00:00
AMDGPUPTNote.h	[AMDGPU] Restructure code object metadata creation	2017-03-22 22:32:22 +00:00
AMDGPUPromoteAlloca.cpp	[AMDGPU] Fix for issue in alloca to vector promotion pass	2017-06-09 14:16:22 +00:00
AMDGPURegAsmNames.inc.cpp	AMDGPU: Work around build special casing .inc files	2017-06-08 19:25:21 +00:00
AMDGPURegisterBankInfo.cpp	[GlobalISel] Make GlobalISel a non-optional library.	2017-08-03 21:52:25 +00:00
AMDGPURegisterBankInfo.h	Sort the remaining #include lines in include/... and lib/....	2017-06-06 11:49:48 +00:00
AMDGPURegisterBanks.td	Re-commit AMDGPU/GlobalISel: Add support for simple shaders	2017-01-30 21:56:46 +00:00
AMDGPURegisterInfo.cpp	AMDGPU: Initial implementation of calls	2017-08-01 19:54:18 +00:00
AMDGPURegisterInfo.h	AMDGPU: Start defining a calling convention	2017-05-17 21:56:25 +00:00
AMDGPURegisterInfo.td	AMDGPU: Move INDIRECT_BASE_ADDR definition out of common files	2017-07-29 03:44:07 +00:00
AMDGPURewriteOutArguments.cpp	[AMDGPU] Put a function used only inside assert() under NDEBUG.	2017-08-01 19:07:20 +00:00
AMDGPUSubtarget.cpp	[GlobalISel] Make GlobalISel a non-optional library.	2017-08-03 21:52:25 +00:00
AMDGPUSubtarget.h	AMDGPU: Annotate implicitarg.ptr usage	2017-07-28 15:52:08 +00:00
AMDGPUTargetMachine.cpp	[AMDGPU] Add support for Whole Wavefront Mode	2017-08-04 18:36:52 +00:00
AMDGPUTargetMachine.h	AMDGPU: Remove error on calls for amdgcn	2017-08-03 23:24:05 +00:00
AMDGPUTargetObjectFile.cpp	Move Object format code to lib/BinaryFormat.	2017-06-07 03:48:56 +00:00
AMDGPUTargetObjectFile.h	[AMDGPU] Get address space mapping by target triple environment	2017-03-27 14:04:01 +00:00
AMDGPUTargetTransformInfo.cpp	[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI.	2017-06-28 15:53:17 +00:00
AMDGPUTargetTransformInfo.h	[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI.	2017-06-28 15:53:17 +00:00
AMDGPUUnifyDivergentExitNodes.cpp	AMDGPU: Unify divergent function exits.	2017-03-24 19:52:05 +00:00
AMDGPUUnifyMetadata.cpp	[AMDGPU] Turn AMDGPUUnifyMetadata back into module pass	2017-01-27 16:38:10 +00:00
AMDILCFGStructurizer.cpp	Guard print() functions only used by dump() functions.	2017-07-31 10:07:49 +00:00
AMDKernelCodeT.h	…
BUFInstructions.td	AMDGPU: Implement memory model	2017-07-21 21:19:23 +00:00
CMakeLists.txt	[AMDGPU] Add support for Whole Wavefront Mode	2017-08-04 18:36:52 +00:00
CaymanInstructions.td	…
DSInstructions.td	[AMDGPU][MC] New syntax for ds_swizzle_b32 offset	2017-05-31 16:26:47 +00:00
EvergreenInstructions.td	AMDGPU: Fix unnecessary ands when packing f16 vectors	2017-03-15 19:04:26 +00:00
FLATInstructions.td	AMDGPU: Start selecting global instructions	2017-07-29 01:03:53 +00:00
GCNHazardRecognizer.cpp	[AMDGPU] Add missing hazard for DPP-after-EXEC-write	2017-08-04 01:09:43 +00:00
GCNHazardRecognizer.h	AMDGPU: Fix broken condition in hazard recognizer	2017-03-17 21:36:28 +00:00
GCNIterativeScheduler.cpp	[CodeGen] Rename DEBUG_TYPE to match passnames	2017-07-11 22:08:28 +00:00
GCNIterativeScheduler.h	[AMDGPU] Iterative scheduling infrastructure + minimal registry scheduler	2017-03-21 13:15:46 +00:00
GCNMinRegStrategy.cpp	[CodeGen] Rename DEBUG_TYPE to match passnames	2017-07-11 22:08:28 +00:00
GCNRegPressure.cpp	Implement LaneBitmask::getNumLanes and LaneBitmask::getHighestLane	2017-07-20 19:43:19 +00:00
GCNRegPressure.h	[AMDGPU] Fix incorrect register usage tracking in GCNUpwardTracker	2017-05-22 13:09:40 +00:00
GCNSchedStrategy.cpp	[CodeGen] Rename DEBUG_TYPE to match passnames	2017-07-11 22:08:28 +00:00
GCNSchedStrategy.h	fix typos in comments and error messges; NFC	2017-07-13 06:48:39 +00:00
LLVMBuild.txt	AMDGPU: Add GlobalISel to required_libraries.	2017-01-28 18:13:08 +00:00
MIMGInstructions.td	[AMDGPU] Fix latency of MIMG instructions	2017-07-04 14:43:38 +00:00
Processors.td	AMDGPU: Whitespace fixes	2017-06-26 03:01:36 +00:00
R600ClauseMergePass.cpp	AMDGPU/R600: Initialize more passes	2017-08-02 22:19:45 +00:00
R600ControlFlowFinalizer.cpp	AMDGPU/R600: Initialize more passes	2017-08-02 22:19:45 +00:00
R600Defines.h	…
R600EmitClauseMarkers.cpp	Sort the remaining #include lines in include/... and lib/....	2017-06-06 11:49:48 +00:00
R600ExpandSpecialInstrs.cpp	AMDGPU/R600: Initialize more passes	2017-08-02 22:19:45 +00:00
R600FrameLowering.cpp	Sort the remaining #include lines in include/... and lib/....	2017-06-06 11:49:48 +00:00
R600FrameLowering.h	AMDGPU: Initial implementation of calls	2017-08-01 19:54:18 +00:00
R600ISelLowering.cpp	Add DAG argument to canMergeStoresTo NFC.	2017-07-10 20:25:54 +00:00
R600ISelLowering.h	Add DAG argument to canMergeStoresTo NFC.	2017-07-10 20:25:54 +00:00
R600InstrFormats.td	…
R600InstrInfo.cpp	Sort the remaining #include lines in include/... and lib/....	2017-06-06 11:49:48 +00:00
R600InstrInfo.h	Cyle -> Cycle; NFCI	2017-03-15 15:37:42 +00:00
R600Instructions.td	AMDGPU: Remove deadcode from AMDGPUInstPrinter	2017-07-29 03:56:53 +00:00
R600Intrinsics.td	AMDGPU: Make intrinsics speculatable	2017-05-02 16:57:44 +00:00
R600MachineFunctionInfo.cpp	…
R600MachineFunctionInfo.h	…
R600MachineScheduler.cpp	[CodeGen] Rename DEBUG_TYPE to match passnames	2017-07-11 22:08:28 +00:00
R600MachineScheduler.h	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2016-12-09 22:06:55 +00:00
R600OptimizeVectorRegisters.cpp	AMDGPU/R600: Initialize more passes	2017-08-02 22:19:45 +00:00
R600Packetizer.cpp	AMDGPU/R600: Initialize more passes	2017-08-02 22:19:45 +00:00
R600RegisterInfo.cpp	AMDGPU: Start defining a calling convention	2017-05-17 21:56:25 +00:00
R600RegisterInfo.h	AMDGPU: Start defining a calling convention	2017-05-17 21:56:25 +00:00
R600RegisterInfo.td	AMDGPU: Move INDIRECT_BASE_ADDR definition out of common files	2017-07-29 03:44:07 +00:00
R600Schedule.td	…
R700Instructions.td	…
SIAnnotateControlFlow.cpp	Remove now useless trailing nullptr in StructType::get	2017-05-11 08:46:02 +00:00
SIDebuggerInsertNops.cpp	Sort the remaining #include lines in include/... and lib/....	2017-06-06 11:49:48 +00:00
SIDefines.h	AMDGPU: Introduce maybeAtomic instruction flag	2017-07-21 21:05:45 +00:00
SIFixControlFlowLiveIntervals.cpp	Use StringRef in Pass/PassManager APIs (NFC)	2016-10-01 02:56:57 +00:00
SIFixSGPRCopies.cpp	[AMDGPU] Add support for Whole Wavefront Mode	2017-08-04 18:36:52 +00:00
SIFixVGPRCopies.cpp	[AMDGPU] Add VGPR copies post regalloc fix pass	2017-01-24 17:46:17 +00:00
SIFixWWMLiveness.cpp	[AMDGPU] Add support for Whole Wavefront Mode	2017-08-04 18:36:52 +00:00
SIFoldOperands.cpp	AMDGPU: Fix crash when folding immediates into multiple uses	2017-07-18 14:54:41 +00:00
SIFrameLowering.cpp	AMDGPU: Pass special input registers to functions	2017-08-03 23:00:29 +00:00
SIFrameLowering.h	AMDGPU: Initial implementation of calls	2017-08-01 19:54:18 +00:00
SIISelLowering.cpp	[AMDGPU] Add support for Whole Wavefront Mode	2017-08-04 18:36:52 +00:00
SIISelLowering.h	AMDGPU: Pass special input registers to functions	2017-08-03 23:00:29 +00:00
SIInsertSkips.cpp	AMDGPU: Rename SI_RETURN	2017-03-21 22:18:10 +00:00
SIInsertWaitcnts.cpp	AMDGPU: Partially fix improper reliance on memoperands	2017-07-21 18:54:54 +00:00
SIInsertWaits.cpp	AMDGPU: Make auto waitcnt before barrier a feature	2017-06-02 17:40:26 +00:00
SIInstrFormats.td	AMDGPU: Introduce maybeAtomic instruction flag	2017-07-21 21:05:45 +00:00
SIInstrInfo.cpp	[AMDGPU] Add support for Whole Wavefront Mode	2017-08-04 18:36:52 +00:00
SIInstrInfo.h	AMDGPU: Make areMemAccessesTriviallyDisjoint more aware of segment flat	2017-07-29 01:26:21 +00:00
SIInstrInfo.td	AMDGPU: Remove leftover td file	2017-07-22 00:40:46 +00:00
SIInstructions.td	[AMDGPU] Add support for Whole Wavefront Mode	2017-08-04 18:36:52 +00:00
SIIntrinsics.td	AMDGPU: Remove legacy export intrinsic	2017-04-04 16:34:39 +00:00
SILoadStoreOptimizer.cpp	[LegacyPassManager] Remove TargetMachine constructors	2017-05-18 17:21:13 +00:00
SILowerControlFlow.cpp	[AMDGPU] Preserve inverted bit in SI_IF in presence of SI_KILL	2017-08-04 06:58:42 +00:00
SILowerI1Copies.cpp	Sort the remaining #include lines in include/... and lib/....	2017-06-06 11:49:48 +00:00
SIMachineFunctionInfo.cpp	AMDGPU: Fix implicitarg.ptr handling special inputs	2017-08-03 23:12:44 +00:00
SIMachineFunctionInfo.h	AMDGPU: Pass special input registers to functions	2017-08-03 23:00:29 +00:00
SIMachineScheduler.cpp	AMDGPU/SI: Fix Depth and Height computation for SI scheduler	2017-07-25 20:37:03 +00:00
SIMachineScheduler.h	AMDGPU/SI: Force exports at the end for SI scheduler	2017-07-25 20:36:58 +00:00
SIMemoryLegalizer.cpp	AMDGPU: Implement memory model	2017-07-21 21:19:23 +00:00
SIOptimizeExecMasking.cpp	[AMDGPU] Turn s_and_saveexec_b64 into s_and_b64 if result is unused	2017-08-01 23:44:35 +00:00
SIOptimizeExecMaskingPreRA.cpp	[AMDGPU] Fix asan error after last commit	2017-08-02 01:18:57 +00:00
SIPeepholeSDWA.cpp	[AMDGPU] SDWA: several fixes for V_CVT and VOPC instructions	2017-06-27 15:02:23 +00:00
SIRegisterInfo.cpp	AMDGPU: Pass special input registers to functions	2017-08-03 23:00:29 +00:00
SIRegisterInfo.h	AMDGPU: Pass special input registers to functions	2017-08-03 23:00:29 +00:00
SIRegisterInfo.td	AMDGPU: Initial implementation of calls	2017-08-01 19:54:18 +00:00
SISchedule.td	AMDGPU: Implement early ifcvt target hooks.	2017-01-25 04:25:02 +00:00
SIShrinkInstructions.cpp	AMDGPU: Allow SIShrinkInstructions to fold FrameIndexes	2017-07-10 20:04:35 +00:00
SIWholeQuadMode.cpp	[AMDGPU] Add support for Whole Wavefront Mode	2017-08-04 18:36:52 +00:00
SMInstructions.td	AMDGPUAnnotateUniformValue should always treat volatile loads as divergent	2017-06-02 15:25:52 +00:00
SOPInstructions.td	Resubmit r303859 with test fixed.	2017-05-26 20:38:26 +00:00
VIInstrFormats.td	[AMDGPU] Refactor VOP1 and VOP2 instruction TD definitions	2016-09-23 09:08:07 +00:00
VIInstructions.td	AMDGPU: Add VI i16 support	2016-11-10 16:02:37 +00:00
VOP1Instructions.td	[AMDGPU] SDWA: merge VI and GFX9 pseudo instructions	2017-06-21 08:53:38 +00:00
VOP2Instructions.td	AMDGPU: Add encoding for carryless add/sub instructions	2017-07-20 17:42:47 +00:00
VOP3Instructions.td	[AMDGPU][MC][GFX9] Added support of VOP3 'op_sel' modifier	2017-07-21 13:54:11 +00:00
VOP3PInstructions.td	[AMDGPU][MC] Added missing VOP3P opcodes	2017-07-18 09:24:10 +00:00
VOPCInstructions.td	[AMDGPU] resubmit r308179: CodeGen: check dst operand type to determine if omod is supported for VOP3 instructions	2017-07-18 14:23:26 +00:00
VOPInstructions.td	[AMDGPU][MC][GFX9] Added support of VOP3 'op_sel' modifier	2017-07-21 13:54:11 +00:00