llvm-project/mlir/test/mlir-cuda-runner
Christian Sigg 1129931a62 Change all_reduce lowering to support 2D and 3D blocks.
Perform the second reduce only with the first warp. This requires an additional __syncthreads(), but doesn't need special handling when the last warp is small. This simplifies support for block sizes that are not a multiple of 32.

Supporting partial warp reduce will be done in a separate CL.

PiperOrigin-RevId: 272168917
2019-10-01 02:51:15 -07:00
..
all-reduce.mlir Change all_reduce lowering to support 2D and 3D blocks. 2019-10-01 02:51:15 -07:00
gpu-to-cubin.mlir JitRunner: support entry functions returning void 2019-08-20 07:46:17 -07:00
lit.local.cfg Add an mlir-cuda-runner tool. 2019-07-04 07:53:54 -07:00
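
The two-stage reduce described in the commit message above can be modelled on the host. The sketch below is not the lowering added by this commit; it is a plain C++ model under stated assumptions: the reduction is a float sum, a warp has 32 lanes, a block has at most 32 warps, and a small last warp is padded with the identity value 0 (the commit itself defers partial warp reduce to a separate CL). All names (`Dim3`, `linearThreadId`, `warpReduce`, `blockAllReduce`) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed warp width; matches the "multiple of 32" wording in the commit.
constexpr std::size_t kWarpSize = 32;

// Hypothetical stand-in for a 3D block size or thread index.
struct Dim3 { std::size_t x, y, z; };

// Flatten a 1D/2D/3D thread index so that x varies fastest.
// Threads with consecutive linear ids fall into the same warp.
std::size_t linearThreadId(Dim3 tid, Dim3 block) {
  return tid.x + block.x * (tid.y + block.y * tid.z);
}

// Models a butterfly reduction across one warp. Lanes at or beyond
// `active` contribute 0, the identity for addition (an assumption of
// this sketch, not something the commit implements).
float warpReduce(const float* lanes, std::size_t active) {
  float regs[kWarpSize];
  for (std::size_t i = 0; i < kWarpSize; ++i)
    regs[i] = i < active ? lanes[i] : 0.0f;
  for (std::size_t offset = kWarpSize / 2; offset > 0; offset /= 2) {
    float next[kWarpSize];
    // Each lane adds the value held by its partner lane (lane XOR offset).
    for (std::size_t lane = 0; lane < kWarpSize; ++lane)
      next[lane] = regs[lane] + regs[lane ^ offset];
    std::copy(next, next + kWarpSize, regs);
  }
  return regs[0];  // after the last step every lane holds the full sum
}

// `values` holds one input per thread, indexed by linear thread id.
float blockAllReduce(const std::vector<float>& values, Dim3 block) {
  const std::size_t numThreads = block.x * block.y * block.z;
  const std::size_t numWarps = (numThreads + kWarpSize - 1) / kWarpSize;
  assert(values.size() == numThreads && numWarps <= kWarpSize);

  // Stage 1: every warp reduces its own lanes and stores one partial
  // result in a buffer that stands in for shared memory.
  float shared[kWarpSize] = {};
  for (std::size_t w = 0; w < numWarps; ++w) {
    const std::size_t begin = w * kWarpSize;
    const std::size_t active = std::min(kWarpSize, numThreads - begin);
    shared[w] = warpReduce(values.data() + begin, active);
  }

  // On a GPU a block-wide barrier would be needed here, so that all
  // partial results are visible before they are read.

  // Stage 2: only the first warp reduces the per-warp partial results.
  return warpReduce(shared, numWarps);
}
```

For a 5x3x2 block (30 threads, a single partial warp) or an 8x8x2 block (128 threads, four full warps), `blockAllReduce` returns the sum of all per-thread inputs. Because the second stage runs in the first warp alone, a small last warp needs no dedicated code path in that stage, which is the simplification the commit message points to.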