forked from OSchip/llvm-project
6641657453
This commit adds a new IR level pass to the AMDGPU backend to perform atomic optimizations. It works by: - Running through a function and finding atomicrmw add/sub or uses of the atomic buffer intrinsics for add/sub. - If all arguments except the value to be added/subtracted are uniform, record the value to be optimized. - Run through the atomic operations we can optimize and, depending on whether the value is uniform/divergent use wavefront wide operations (DPP in the divergent case) to calculate the total amount to be atomically added/subtracted. - Then let only a single lane of each wavefront perform the atomic operation, reducing the total number of atomic operations in flight. - Lastly we recombine the result from the single lane to each lane of the wavefront, and calculate our individual lanes offset into the final result. Differential Revision: https://reviews.llvm.org/D51969 llvm-svn: 343973 |
||
---|---|---|
.. | ||
Analysis | ||
Assembler | ||
Bindings | ||
Bitcode | ||
BugPoint | ||
CodeGen | ||
DebugInfo | ||
Demangle | ||
Examples | ||
ExecutionEngine | ||
Feature | ||
FileCheck | ||
Instrumentation | ||
Integer | ||
JitListener | ||
LTO | ||
Linker | ||
MC | ||
Object | ||
ObjectYAML | ||
Other | ||
SafepointIRVerifier | ||
SymbolRewriter | ||
TableGen | ||
ThinLTO/X86 | ||
Transforms | ||
Unit | ||
Verifier | ||
YAMLParser | ||
tools | ||
.clang-format | ||
CMakeLists.txt | ||
TestRunner.sh | ||
lit.cfg.py | ||
lit.site.cfg.py.in |