llvm-project

History

harsh e01e4c9115 Fix bugs in GPUToNVVM lowering The current lowering from GPU to NVVM does not correctly handle the following cases when lowering the gpu shuffle op. 1. When the active width is set to 32 (all lanes), then the current approach computes (1 << 32) -1 which results in poison values in the LLVM IR. We fix this by defining the active mask as (-1) >> (32 - width). 2. In the case of shuffle up, the computation of the third operand c has to be different from the other 3 modes due to the op definition in the ISA reference. (https://docs.nvidia.com/cuda/parallel-thread-execution/index.html) Specifically, the predicate value is computed as j >= maxLane for up and j <= maxLane for all other modes. We fix this by computing maskAndClamp as 32 - width for this mode. TEST: We modify the existing test and add more checks for the up mode. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D118086	2022-01-25 03:24:14 +00:00
..
gpu-to-nvvm.mlir	Fix bugs in GPUToNVVM lowering	2022-01-25 03:24:14 +00:00
wmma-ops-to-nvvm.mlir	[mlir] Replace StrEnumAttr -> EnumAttr in core dialects	2022-01-18 17:15:00 +00:00

harsh e01e4c9115 Fix bugs in GPUToNVVM lowering

The current lowering from GPU to NVVM does
not correctly handle the following cases when
lowering the gpu shuffle op.

1. When the active width is set to 32 (all lanes),
then the current approach computes (1 << 32) -1 which
results in poison values in the LLVM IR. We fix this by
defining the active mask as (-1) >> (32 - width).

2. In the case of shuffle up, the computation of the third
operand c has to be different from the other 3 modes due to
the op definition in the ISA reference.
(https://docs.nvidia.com/cuda/parallel-thread-execution/index.html)
Specifically, the predicate value is computed as j >= maxLane
for up and j <= maxLane for all other modes. We fix this by
computing maskAndClamp as 32 - width for this mode.

TEST: We modify the existing test and add more checks for the up mode.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D118086

2022-01-25 03:24:14 +00:00

gpu-to-nvvm.mlir

Fix bugs in GPUToNVVM lowering

2022-01-25 03:24:14 +00:00

wmma-ops-to-nvvm.mlir

[mlir] Replace StrEnumAttr -> EnumAttr in core dialects

2022-01-18 17:15:00 +00:00