forked from OSchip/llvm-project
733c7fc55d
Since PTX has grown a <2 x half> datatype, vectorization has become more important. The late LoadStoreVectorizer intentionally handles only loads and stores, but arithmetic now has to be vectorized too for optimal throughput. This is still very limited: SLP vectorization happily creates <2 x half> if it's a legal type, but a lot of register moving still happens to feed the result into a vectorized store. Overall it's a small performance win from reducing the number of arithmetic instructions. I haven't really checked what the loop vectorizer does to PTX code; the cost model there might need some more tweaks. I didn't see it causing harm, though. Differential Revision: https://reviews.llvm.org/D46130 llvm-svn: 331035 |
||
---|---|---|
InstPrinter | ||
MCTargetDesc | ||
TargetInfo | ||
CMakeLists.txt | ||
LLVMBuild.txt | ||
ManagedStringPool.h | ||
NVPTX.h | ||
NVPTX.td | ||
NVPTXAllocaHoisting.cpp | ||
NVPTXAllocaHoisting.h | ||
NVPTXAsmPrinter.cpp | ||
NVPTXAsmPrinter.h | ||
NVPTXAssignValidGlobalNames.cpp | ||
NVPTXFrameLowering.cpp | ||
NVPTXFrameLowering.h | ||
NVPTXGenericToNVVM.cpp | ||
NVPTXISelDAGToDAG.cpp | ||
NVPTXISelDAGToDAG.h | ||
NVPTXISelLowering.cpp | ||
NVPTXISelLowering.h | ||
NVPTXImageOptimizer.cpp | ||
NVPTXInstrFormats.td | ||
NVPTXInstrInfo.cpp | ||
NVPTXInstrInfo.h | ||
NVPTXInstrInfo.td | ||
NVPTXIntrinsics.td | ||
NVPTXLowerAggrCopies.cpp | ||
NVPTXLowerAggrCopies.h | ||
NVPTXLowerAlloca.cpp | ||
NVPTXLowerArgs.cpp | ||
NVPTXMCExpr.cpp | ||
NVPTXMCExpr.h | ||
NVPTXMachineFunctionInfo.h | ||
NVPTXPeephole.cpp | ||
NVPTXPrologEpilogPass.cpp | ||
NVPTXRegisterInfo.cpp | ||
NVPTXRegisterInfo.h | ||
NVPTXRegisterInfo.td | ||
NVPTXReplaceImageHandles.cpp | ||
NVPTXSubtarget.cpp | ||
NVPTXSubtarget.h | ||
NVPTXTargetMachine.cpp | ||
NVPTXTargetMachine.h | ||
NVPTXTargetObjectFile.h | ||
NVPTXTargetTransformInfo.cpp | ||
NVPTXTargetTransformInfo.h | ||
NVPTXUtilities.cpp | ||
NVPTXUtilities.h | ||
NVVMIntrRange.cpp | ||
NVVMReflect.cpp | ||
cl_common_defines.h |
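
The commit message above describes two independent scalar operations on adjacent elements being combined into one <2 x half> operation. The following is a rough host-side C++ sketch of that rewrite, not code from this directory. `Pair`, `add2`, `add_scalar`, and `add_packed` are hypothetical names introduced for illustration, and `float` stands in for `half` so the sketch compiles on any host.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for <2 x half>; float replaces half for portability.
struct Pair {
  float lo;
  float hi;
};

// Models a single packed add: one operation producing both lanes.
Pair add2(Pair a, Pair b) { return {a.lo + b.lo, a.hi + b.hi}; }

// Scalar form: two independent adds per pair of adjacent elements.
// This is the kind of pattern an SLP vectorizer looks for.
void add_scalar(const std::vector<float> &x, std::vector<float> &y) {
  assert(x.size() == y.size());
  std::size_t i = 0;
  for (; i + 1 < x.size(); i += 2) {
    y[i] = x[i] + y[i];
    y[i + 1] = x[i + 1] + y[i + 1];
  }
  if (i < x.size()) y[i] = x[i] + y[i];  // odd tail element stays scalar
}

// Packed form: the same work expressed as one two-lane add per pair,
// followed by a store of both lanes.
void add_packed(const std::vector<float> &x, std::vector<float> &y) {
  assert(x.size() == y.size());
  std::size_t i = 0;
  for (; i + 1 < x.size(); i += 2) {
    Pair r = add2(Pair{x[i], x[i + 1]}, Pair{y[i], y[i + 1]});
    y[i] = r.lo;
    y[i + 1] = r.hi;
  }
  if (i < x.size()) y[i] = x[i] + y[i];  // odd tail element stays scalar
}
```

Both functions compute the same result. The packed form issues half as many arithmetic operations per pair, which is the saving the commit message refers to. The extra packing and unpacking around `add2` loosely mirrors the register moving it mentions.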