forked from OSchip/llvm-project
[NFC][OpenMP] Update simd loop collapse support description
The simdloop collapse clause is supported in the same way as the collapse clause for worksharing loops.

Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D131674
Signed-off-by: Dominik Adamski <dominik.adamski@amd.com>
parent 49223e0a2d
commit 19bd4789b9
@@ -412,6 +412,9 @@ def SimdLoopOp : OpenMP_Op<"simdloop", [AttrSizedOperandSegments,
 The body region can contain any number of blocks. The region is terminated
 by "omp.yield" instruction without operands.
 
+Collapsed loops are represented by the simd-loop having a list of indices,
+bounds and steps where the size of the list is equal to the collapse value.
+
 When an if clause is present and evaluates to false, the preferred number of
 iterations to be executed concurrently is one, regardless of whether
 a simdlen clause is specified.
@@ -712,7 +712,22 @@ llvm.func @simdloop_simple_multiple(%lb1 : i64, %ub1 : i64, %step1 : i64, %lb2 :
 omp.simdloop for (%iv1, %iv2) : i64 = (%lb1, %lb2) to (%ub1, %ub2) step (%step1, %step2) {
 %3 = llvm.mlir.constant(2.000000e+00 : f32) : f32
 // The form of the emitted IR is controlled by OpenMPIRBuilder and
-// tested there. Just check that the right metadata is added.
+// tested there. Just check that the right metadata is added and collapsed
+// loop bound is generated (Collapse clause is represented as a loop with
+// list of indices, bounds and steps where the size of the list is equal
+// to the collapse value.)
+// CHECK: icmp slt i64
+// CHECK-COUNT-3: select
+// CHECK: %[[TRIPCOUNT0:.*]] = select
+// CHECK: br label %[[PREHEADER:.*]]
+// CHECK: [[PREHEADER]]:
+// CHECK: icmp slt i64
+// CHECK-COUNT-3: select
+// CHECK: %[[TRIPCOUNT1:.*]] = select
+// CHECK: mul nuw i64 %[[TRIPCOUNT0]], %[[TRIPCOUNT1]]
+// CHECK: br label %[[COLLAPSED_PREHEADER:.*]]
+// CHECK: [[COLLAPSED_PREHEADER]]:
+// CHECK: br label %[[COLLAPSED_HEADER:.*]]
 // CHECK: llvm.access.group
 // CHECK-NEXT: llvm.access.group
 %4 = llvm.getelementptr %arg0[%iv1] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
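The new CHECK lines expect one trip count per loop in the nest and a `mul nuw` that combines them into the bound of the single collapsed loop. The arithmetic behind that can be sketched in Python. This is only an illustration, not the code OpenMPIRBuilder emits: the helper names (`trip_count`, `collapsed_trip_count`, `delinearize`) are hypothetical, and the sketch assumes positive steps and exclusive upper bounds.

```python
from math import prod


def trip_count(lb, ub, step):
    """Number of iterations of one loop from lb up to, but excluding, ub.

    Assumption for this sketch: step is positive and ub is exclusive.
    """
    if step <= 0:
        raise ValueError("this sketch only handles positive steps")
    if ub <= lb:
        return 0
    # Ceiling division of the iteration range by the step.
    return (ub - lb + step - 1) // step


def collapsed_trip_count(lbs, ubs, steps):
    """Trip count of the collapsed loop: the product of the per-loop counts.

    The list length plays the role of the collapse value.
    """
    if not (len(lbs) == len(ubs) == len(steps)):
        raise ValueError("bounds and steps must have the same length")
    return prod(trip_count(lb, ub, st) for lb, ub, st in zip(lbs, ubs, steps))


def delinearize(flat, lbs, ubs, steps):
    """Recover the per-loop induction values from a collapsed iteration index."""
    counts = [trip_count(lb, ub, st) for lb, ub, st in zip(lbs, ubs, steps)]
    if not 0 <= flat < prod(counts):
        raise IndexError("collapsed index out of range")
    ivs = []
    # The innermost loop varies fastest, so peel it off first.
    for lb, st, count in reversed(list(zip(lbs, steps, counts))):
        flat, pos = divmod(flat, count)
        ivs.append(lb + pos * st)
    return tuple(reversed(ivs))


if __name__ == "__main__":
    # Two loops, as in simdloop_simple_multiple: (lb1, lb2), (ub1, ub2), (step1, step2).
    print(collapsed_trip_count([0, 0], [4, 3], [1, 1]))
    print(delinearize(5, [0, 0], [4, 3], [1, 1]))
```

With a collapse value of 2, the two lists of bounds and steps each have two entries, which mirrors the `(%lb1, %lb2) to (%ub1, %ub2) step (%step1, %step2)` form in the test.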