llvm-project/llvm/lib
Connor Abbott 92638ab625 [AMDGPU] Add support for Whole Wavefront Mode
Summary:
Whole Wavefront Mode (WWM) is similar to WQM, except that all of the
lanes are always enabled, regardless of control flow. This is required
for implementing wavefront reductions in non-uniform control flow, where
we need to use the inactive lanes to propagate intermediate results, so
they need to be enabled. We need to propagate WWM to uses (unless
they're explicitly marked as exact) so that they also propagate
intermediate results correctly. We do the analysis and exec mask munging
during the WQM pass, since there are interactions with WQM for things
that require both WQM and WWM. For simplicity, WWM is entirely
block-local -- a block is never in WWM on entry or on exit, and WWM
is not propagated to the block level. This means that computations
involving WWM cannot involve control flow, but we only ever plan to use
WWM for a few limited purposes (none of which involve control flow)
anyway.
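
To illustrate why a reduction needs the inactive lanes enabled, here is a
minimal sketch that models a wavefront as a Python list. It is not the
hardware behaviour or LLVM's implementation: the function name
butterfly_sum, the 8-lane wavefront, the butterfly (XOR) exchange pattern,
and the seeding of inactive lanes with 0 are all assumptions made for
illustration.

```python
def butterfly_sum(values, exec_mask, whole_wavefront):
    """Sum the values of the active lanes with a butterfly exchange.

    Hypothetical model: `values` holds one register per lane and
    `exec_mask` says which lanes are active under control flow.
    The lane count must be a power of two.
    """
    n = len(values)
    # Assumption: inactive lanes are seeded with the identity for addition.
    regs = [v if active else 0 for v, active in zip(values, exec_mask)]
    # In WWM every lane executes; otherwise only the active lanes do.
    enabled = [True] * n if whole_wavefront else list(exec_mask)
    offset = 1
    while offset < n:
        prev = list(regs)  # every lane reads its partner's previous value
        for lane in range(n):
            if enabled[lane]:
                regs[lane] = prev[lane] + prev[lane ^ offset]
        offset *= 2
    return regs


values = [1, 2, 3, 4, 5, 6, 7, 8]
exec_mask = [True, False, False, True, True, False, True, False]

with_wwm = butterfly_sum(values, exec_mask, whole_wavefront=True)
without_wwm = butterfly_sum(values, exec_mask, whole_wavefront=False)
print(with_wwm)     # every lane holds 1 + 4 + 5 + 7 = 17
print(without_wwm)  # partial sums are lost in the disabled lanes
```

With all lanes enabled, the disabled lanes carry partial sums between
steps, so every lane ends with the full total. With only the active lanes
enabled, a partner lane that never executed still holds its initial value
and the intermediate results it should have forwarded are missing.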

Shaders can ask for WWM using the @llvm.amdgcn.wwm intrinsic. There
isn't yet a way to turn WWM off -- that will be added in a future
change.

Finally, it turns out that turning on inactive lanes causes a number of
problems with register allocation. While the best long-term solution
seems to be teaching LLVM's register allocator about predication, for now
we need to add some hacks to prevent ourselves from getting into trouble
due to constraints that aren't currently expressed in LLVM. For the gory
details, see the comments at the top of SIFixWWMLiveness.cpp.

Reviewers: arsenm, nhaehnle, tpr

Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D35524

llvm-svn: 310087
2017-08-04 18:36:52 +00:00
..
Analysis [Inliner] Fix a typo in option description. NFC. 2017-08-04 17:15:17 +00:00
AsmParser Debug Info: Add a file: field to DIImportedEntity. 2017-07-19 00:09:54 +00:00
BinaryFormat Revert "Revert "Revert "Revert "Switch external cvtres.exe for llvm's own resource library."""" 2017-07-08 03:06:10 +00:00
Bitcode [ThinLTO] Add FunctionAttrs to ThinLTO index 2017-08-04 16:00:58 +00:00
CodeGen [MachineOperand] Add ChangeToTargetIndex method. NFC 2017-08-04 18:24:09 +00:00
DebugInfo [PDB] Fix section contributions 2017-08-03 21:15:09 +00:00
Demangle [ItaniumDemangle] Fix a exponential string copying bug 2017-05-28 23:24:52 +00:00
ExecutionEngine Delete Default and JITDefault code models 2017-08-03 02:16:21 +00:00
Fuzzer Fixing buildbots: do not register check-fuzzer if clang or asan are not 2017-08-04 17:43:29 +00:00
IR Prevent unused warning in non-assert builds (introduced in r310014). 2017-08-04 05:05:29 +00:00
IRReader
LTO Delete Default and JITDefault code models 2017-08-03 02:16:21 +00:00
LineEditor
Linker [Linker] Add directives to support mixing ARM/Thumb module-level inline asm. 2017-07-12 11:52:28 +00:00
MC Don't pass the code model to MC 2017-08-02 20:32:26 +00:00
Object Don't pass the code model to MC 2017-08-02 20:32:26 +00:00
ObjectYAML [yaml2obj][ELF] Add support for program headers 2017-07-19 20:38:46 +00:00
Option [Bash-autocompletion] Show HelpText with possible flags 2017-07-26 13:36:58 +00:00
Passes [PM] Split LoopUnrollPass and make partial unroller a function pass 2017-08-02 20:35:29 +00:00
ProfileData Fix the bug when SampleProfileWriter writes out number of callsites. 2017-08-03 00:09:18 +00:00
Support [Support] Remove getPathFromOpenFD, it was unused 2017-08-04 17:43:49 +00:00
TableGen [BinaryFormat, Option, TableGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). 2017-06-16 00:43:26 +00:00
Target [AMDGPU] Add support for Whole Wavefront Mode 2017-08-04 18:36:52 +00:00
Testing Mark LLVMTestingSupport as not installed in LLVMBuild. 2017-06-19 22:01:50 +00:00
ToolDrivers llvm: add llvm-dlltool support to the archiver 2017-07-18 21:26:38 +00:00
Transforms [ArgPromotion] Preserve alignment of byval argument in new alloca 2017-08-04 17:09:11 +00:00
WindowsManifest Unlink nodes instead of copying, to avoid memory problems. 2017-07-26 18:33:21 +00:00
XRay Xray docs with description of Flight Data Recorder binary format. 2017-08-02 21:47:27 +00:00
CMakeLists.txt Move manifest utils into separate lib, to reduce libxml2 deps. 2017-07-26 01:21:55 +00:00
LLVMBuild.txt Move manifest utils into separate lib, to reduce libxml2 deps. 2017-07-26 01:21:55 +00:00