llvm-project/llvm/lib/Target/NVPTX
Benjamin Kramer 733c7fc55d [NVPTX] Turn on Loop/SLP vectorization
Since PTX has grown a <2 x half> datatype, vectorization has become more
important. The late LoadStoreVectorizer intentionally only handles loads
and stores, but arithmetic now has to be vectorized for optimal
throughput too.

This is still very limited: SLP vectorization happily creates <2 x half>
if it's a legal type, but there's still a lot of register moving
happening to get that fed into a vectorized store. Overall it's a small
performance win from reducing the number of arithmetic instructions.
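
To make the SLP case concrete, here is a minimal host-side C++ sketch of the
pattern involved. It is an illustration under stated assumptions, not code from
this commit: Pair2, add2, addScalar and addPacked are hypothetical names, and
float stands in for half because standard C++ has no half type. addScalar shows
two independent adds on adjacent elements, the shape an SLP vectorizer looks
for; addPacked shows the equivalent single two-lane operation.

```cpp
// Hypothetical two-lane value standing in for LLVM's <2 x half>.
// Assumption: float replaces half, since standard C++ has no half type.
struct Pair2 {
  float lo;
  float hi;
};

// One two-lane add: both lanes are combined by a single logical operation.
inline Pair2 add2(Pair2 a, Pair2 b) {
  return Pair2{a.lo + b.lo, a.hi + b.hi};
}

// Scalar form: two independent adds on adjacent elements.
void addScalar(const float *a, const float *b, float *out) {
  out[0] = a[0] + b[0];
  out[1] = a[1] + b[1];
}

// Packed form: the same computation expressed as one two-lane add,
// followed by unpacking the lanes into the output.
void addPacked(const float *a, const float *b, float *out) {
  Pair2 r = add2(Pair2{a[0], a[1]}, Pair2{b[0], b[1]});
  out[0] = r.lo;
  out[1] = r.hi;
}
```

Both functions produce the same values; the packed form issues one arithmetic
operation instead of two, while the packing and unpacking steps correspond to
the register moving mentioned above.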

I haven't really checked what the loop vectorizer does to PTX code; the
cost model there might need some more tweaks. I didn't see it causing
harm, though.

Differential Revision: https://reviews.llvm.org/D46130

llvm-svn: 331035
2018-04-27 13:36:05 +00:00
InstPrinter
MCTargetDesc [DEBUG] Initial adaptation of NVPTX target for debug info emission. 2018-04-18 16:13:41 +00:00
TargetInfo Add backend name to Target to enable runtime info to be fed back into TableGen 2017-11-15 23:55:44 +00:00
CMakeLists.txt Consistently sort add_subdirectory calls in lib/Target/*/CMakeLists.txt 2018-04-23 12:49:34 +00:00
LLVMBuild.txt
ManagedStringPool.h
NVPTX.h
NVPTX.td [NVPTX, CUDA] Added support for m8n32k16 and m32n8k16 variants of wmma instructions. 2018-04-18 21:51:48 +00:00
NVPTXAllocaHoisting.cpp
NVPTXAllocaHoisting.h
NVPTXAsmPrinter.cpp [DEBUG] Initial adaptation of NVPTX target for debug info emission. 2018-04-18 16:13:41 +00:00
NVPTXAsmPrinter.h [DEBUG] Initial adaptation of NVPTX target for debug info emission. 2018-04-18 16:13:41 +00:00
NVPTXAssignValidGlobalNames.cpp [NVPTX] Assign valid global names 2017-12-04 14:19:33 +00:00
NVPTXFrameLowering.cpp Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering 2017-11-08 01:01:31 +00:00
NVPTXFrameLowering.h Move TargetFrameLowering.h to CodeGen where it's implemented 2017-11-03 22:32:11 +00:00
NVPTXGenericToNVVM.cpp [IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer. 2018-03-29 17:21:10 +00:00
NVPTXISelDAGToDAG.cpp [NVPTX] Fixed vectorized LDG for f16. 2018-04-06 21:10:24 +00:00
NVPTXISelDAGToDAG.h [NVPTX] TblGen-ized lowering of WMMA intrinsics. 2018-03-15 21:40:56 +00:00
NVPTXISelLowering.cpp [NVPTX] Make the legalizer expand shufflevector of <2 x half> 2018-04-26 15:26:29 +00:00
NVPTXISelLowering.h TLI: Allow using PSV for intrinsic mem operands 2017-12-14 22:34:10 +00:00
NVPTXImageOptimizer.cpp
NVPTXInstrFormats.td
NVPTXInstrInfo.cpp Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
NVPTXInstrInfo.h Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering 2017-11-08 01:01:31 +00:00
NVPTXInstrInfo.td [NVPTX, CUDA] Added support for m8n32k16 and m32n8k16 variants of wmma instructions. 2018-04-18 21:51:48 +00:00
NVPTXIntrinsics.td [NVPTX, CUDA] Added support for m8n32k16 and m32n8k16 variants of wmma instructions. 2018-04-18 21:51:48 +00:00
NVPTXLowerAggrCopies.cpp [Memcpy Loop Lowering] Remove the fixed int8 lowering. 2017-12-18 15:31:14 +00:00
NVPTXLowerAggrCopies.h
NVPTXLowerAlloca.cpp
NVPTXLowerArgs.cpp Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
NVPTXMCExpr.cpp Avoid int to string conversion in Twine or raw_ostream contexts. 2017-12-28 16:58:54 +00:00
NVPTXMCExpr.h
NVPTXMachineFunctionInfo.h
NVPTXPeephole.cpp MachineFunction: Return reference from getFunction(); NFC 2017-12-15 22:22:58 +00:00
NVPTXPrologEpilogPass.cpp Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
NVPTXRegisterInfo.cpp Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering 2017-11-08 01:01:31 +00:00
NVPTXRegisterInfo.h Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
NVPTXRegisterInfo.td
NVPTXReplaceImageHandles.cpp MachineFunction: Return reference from getFunction(); NFC 2017-12-15 22:22:58 +00:00
NVPTXSubtarget.cpp
NVPTXSubtarget.h [NVPTX] Removed 'satom' feature which is no longer used. 2018-04-11 17:51:33 +00:00
NVPTXTargetMachine.cpp [CodeGen]Add NoVRegs property on PostRASink and ShrinkWrap 2018-04-03 18:17:34 +00:00
NVPTXTargetMachine.h (Re-landing) Expose a TargetMachine::getTargetTransformInfo function 2017-12-22 18:21:59 +00:00
NVPTXTargetObjectFile.h [DEBUG] Initial adaptation of NVPTX target for debug info emission. 2018-04-18 16:13:41 +00:00
NVPTXTargetTransformInfo.cpp Fix a bunch more layering of CodeGen headers that are in Target 2017-11-17 01:07:10 +00:00
NVPTXTargetTransformInfo.h [NVPTX] Turn on Loop/SLP vectorization 2018-04-27 13:36:05 +00:00
NVPTXUtilities.cpp
NVPTXUtilities.h
NVVMIntrRange.cpp Sort the remaining #include lines in include/... and lib/.... 2017-06-06 11:49:48 +00:00
NVVMReflect.cpp
cl_common_defines.h