[NFC][OpenMP] Update simd loop collapse support description

The simdloop collapse clause is supported in the same way
as the collapse clause for worksharing loops.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D131674

Signed-off-by: Dominik Adamski <dominik.adamski@amd.com>
Dominik Adamski 2022-08-16 05:18:14 -05:00
parent 49223e0a2d
commit 19bd4789b9
2 changed files with 19 additions and 1 deletions


@@ -412,6 +412,9 @@ def SimdLoopOp : OpenMP_Op<"simdloop", [AttrSizedOperandSegments,
The body region can contain any number of blocks. The region is terminated
by "omp.yield" instruction without operands.
Collapsed loops are represented by the simd-loop having a list of indices,
bounds and steps where the size of the list is equal to the collapse value.
When an if clause is present and evaluates to false, the preferred number of
iterations to be executed concurrently is one, regardless of whether
a simdlen clause is specified.
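
The collapsed representation described above can be illustrated with a small C++ sketch. It is not the OpenMPIRBuilder implementation. It only shows the general idea that a two-level nest is driven by one flat counter whose range is the product of the per-loop trip counts (the `mul nuw` checked in the test below). The helper names `tripCount` and `collapsedIterations` are hypothetical, and the sketch assumes positive steps and exclusive upper bounds.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: trip count of a single loop.
// Assumes step > 0 and an exclusive upper bound.
std::int64_t tripCount(std::int64_t lb, std::int64_t ub, std::int64_t step) {
  if (ub <= lb)
    return 0; // empty iteration space
  return (ub - lb + step - 1) / step; // ceiling division
}

// Hypothetical helper: enumerate the (iv1, iv2) pairs of a two-level
// nest through one flat loop, as a collapse value of 2 would.
std::vector<std::pair<std::int64_t, std::int64_t>>
collapsedIterations(std::int64_t lb1, std::int64_t ub1, std::int64_t step1,
                    std::int64_t lb2, std::int64_t ub2, std::int64_t step2) {
  const std::int64_t trip1 = tripCount(lb1, ub1, step1);
  const std::int64_t trip2 = tripCount(lb2, ub2, step2);
  const std::int64_t total = trip1 * trip2; // collapsed trip count

  std::vector<std::pair<std::int64_t, std::int64_t>> out;
  out.reserve(static_cast<std::size_t>(total));
  for (std::int64_t flat = 0; flat < total; ++flat) {
    // Recover the original induction variables from the flat index.
    // The loop body is never entered when trip2 == 0, so the division
    // and remainder below are safe.
    const std::int64_t iv1 = lb1 + (flat / trip2) * step1;
    const std::int64_t iv2 = lb2 + (flat % trip2) * step2;
    out.emplace_back(iv1, iv2);
  }
  return out;
}
```

Calling `collapsedIterations(0, 4, 2, 10, 13, 1)` visits the same index pairs as the original nest, outer index first, using a single loop counter.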


@@ -712,7 +712,22 @@ llvm.func @simdloop_simple_multiple(%lb1 : i64, %ub1 : i64, %step1 : i64, %lb2 :
omp.simdloop for (%iv1, %iv2) : i64 = (%lb1, %lb2) to (%ub1, %ub2) step (%step1, %step2) {
%3 = llvm.mlir.constant(2.000000e+00 : f32) : f32
// The form of the emitted IR is controlled by OpenMPIRBuilder and
// tested there. Just check that the right metadata is added and that the
// collapsed loop bound is generated. (The collapse clause is represented as
// a loop with a list of indices, bounds and steps, where the size of the
// list is equal to the collapse value.)
// CHECK: icmp slt i64
// CHECK-COUNT-3: select
// CHECK: %[[TRIPCOUNT0:.*]] = select
// CHECK: br label %[[PREHEADER:.*]]
// CHECK: [[PREHEADER]]:
// CHECK: icmp slt i64
// CHECK-COUNT-3: select
// CHECK: %[[TRIPCOUNT1:.*]] = select
// CHECK: mul nuw i64 %[[TRIPCOUNT0]], %[[TRIPCOUNT1]]
// CHECK: br label %[[COLLAPSED_PREHEADER:.*]]
// CHECK: [[COLLAPSED_PREHEADER]]:
// CHECK: br label %[[COLLAPSED_HEADER:.*]]
// CHECK: llvm.access.group
// CHECK-NEXT: llvm.access.group
%4 = llvm.getelementptr %arg0[%iv1] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>