forked from OSchip/llvm-project
eaaf7a6a09
Add conversion of warp synchronous matrix-multiply accumulate GPU ops

Add conversion of warp synchronous matrix-multiply accumulate GPU ops to NVVM ops. The following conversions are added:

1. subgroup_mma_load_matrix -> wmma.m16n16k16.load.[a,b,c]..row.stride
2. subgroup_mma_store_matrix -> wmma.m16n16k16.store.d.[f16,f32].row.stride
3. subgroup_mma_compute -> wmma.m16n16k16.mma.row.row.[f16,f32].[f16,f32]

Reviewed By: bondhugula, ftynse

Differential Revision: https://reviews.llvm.org/D95331

Name | ||
---|---|---|
cmake/modules | ||
docs | ||
examples | ||
include | ||
lib | ||
python | ||
test | ||
tools | ||
unittests | ||
utils | ||
.clang-format | ||
.clang-tidy | ||
CMakeLists.txt | ||
LICENSE.TXT | ||
README.md |
README.md
Multi-Level Intermediate Representation
See https://mlir.llvm.org/ for more information.