forked from OSchip/llvm-project
48a9bdc6aa
Summary: [NVPTX] make load on global readonly memory to use ldg Summary: As described in [1], nvcc may use ld.global.nc to load memory when __restrict__ is used and the compiler can determine that the read-only data cache is safe to use. This patch checks whether ldg is safe to use and, where possible, uses it in place of ld.global. This change can improve performance by 18~29% on the affected kernels (ratt*_kernel and rwdot*_kernel) in the S3D benchmark of SHOC [2]. Patch by Xuetian Weng. [1] http://docs.nvidia.com/cuda/kepler-tuning-guide/#read-only-data-cache [2] https://github.com/vetter/shoc Test Plan: test/CodeGen/NVPTX/load-with-non-coherent-cache.ll Reviewers: jholewinski, jingyue Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D11314 llvm-svn: 242713 |
||
---|---|---|
Analysis | ||
AsmParser | ||
Bitcode | ||
CodeGen | ||
DebugInfo | ||
ExecutionEngine | ||
Fuzzer | ||
IR | ||
IRReader | ||
LTO | ||
LibDriver | ||
LineEditor | ||
Linker | ||
MC | ||
Object | ||
Option | ||
Passes | ||
ProfileData | ||
Support | ||
TableGen | ||
Target | ||
Transforms | ||
CMakeLists.txt | ||
LLVMBuild.txt | ||
Makefile |