Commit Graph

Alex Zinenko 5c73db24df [mlir] disallow side-effecting ops in llvm.mlir.global
The llvm.mlir.global operation accepts a region as an initializer. This region
corresponds to an LLVM IR constant expression and therefore should not accept
operations with side effects. Add a corresponding verifier.
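
To illustrate (hypothetical IR; the symbol @compute and the values are made up), the verifier now distinguishes cases like:
```mlir
// Accepted: the initializer region is a pure constant expression.
llvm.mlir.global internal @ok() : i64 {
  %0 = llvm.mlir.constant(42 : i64) : i64
  llvm.return %0 : i64
}

// Rejected: llvm.call is side-effecting, so an initializer like this now
// fails verification.
llvm.mlir.global internal @bad() : i64 {
  %0 = llvm.call @compute() : () -> i64
  llvm.return %0 : i64
}
```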

Reviewed By: wsmoses, bondhugula

Differential Revision: https://reviews.llvm.org/D120632
2022-03-01 14:16:09 +01:00
gysit e9085d0d25 [mlir][OpDSL] Rename function to make signedness explicit (NFC).
The revision renames the following OpDSL functions:
```
TypeFn.cast -> TypeFn.cast_signed
BinaryFn.min -> BinaryFn.min_signed
BinaryFn.max -> BinaryFn.max_signed
```
The corresponding enum values on the C++ side are renamed accordingly:
```
#linalg.type_fn<cast> -> #linalg.type_fn<cast_signed>
#linalg.binary_fn<min> -> #linalg.binary_fn<min_signed>
#linalg.binary_fn<max> -> #linalg.binary_fn<max_signed>
```

Depends On D120110

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D120562
2022-03-01 08:15:53 +00:00
gysit 24357fec8d [mlir][OpDSL] Add arithmetic function attributes.
The revision extends OpDSL with unary and binary function attributes. A function attribute makes the operations used in the body of a structured operation configurable. For example, a pooling operation may take an aggregation function attribute that specifies whether the op shall implement min or max pooling. The goal of this revision is to define fewer and more flexible operations.

We may thus, for example, define an element-wise op:
```
linalg.elem(lhs, rhs, outs=[out], op=BinaryFn.mul)
```
If the op argument is not set, the default operation is used.

Depends On D120109

Reviewed By: nicolasvasilache, aartbik

Differential Revision: https://reviews.llvm.org/D120110
2022-03-01 07:45:47 +00:00
Michael Kruse a66f7769a3 [OpenMPIRBuilder] Implement static-chunked workshare-loop schedules.
Add an applyStaticChunkedWorkshareLoop method implementing the static schedule when a chunk size is specified. Unlike a static schedule without a chunk size (where the chunk size is chosen by the runtime such that each thread receives one chunk), we need two nested loops: one for looping over the iterations of a chunk, and a second for looping over all chunks assigned to the thread.
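
Conceptually, the generated structure resembles the following sketch (written as scf IR purely for illustration; %threadLb, %chunkSize, and %dispatchStride are hypothetical values provided by the runtime, and this is not the builder's actual output):
```mlir
// Outer loop: iterate over all chunks assigned to this thread.
scf.for %chunkLb = %threadLb to %ub step %dispatchStride {
  // Clamp the end of the chunk to the global upper bound.
  %chunkEnd = arith.addi %chunkLb, %chunkSize : index
  %chunkUb = arith.minsi %chunkEnd, %ub : index
  // Inner loop: iterate over the iterations of one chunk.
  scf.for %iv = %chunkLb to %chunkUb step %step {
    // ... loop body for iteration %iv ...
  }
}
```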

This patch includes the following related changes:
 * Adapt applyWorkshareLoop to triage between the schedule types, now possible since all schedules have been implemented. The default schedule is assumed to be non-chunked static, as without OpenMPIRBuilder.
 * Remove the chunk parameter from applyStaticWorkshareLoop; it is ignored by the runtime. Change the value passed to the init function to 0, as without OpenMPIRBuilder.
 * Refactor CanonicalLoopInfo::setTripCount and CanonicalLoopInfo::mapIndVar as used by both applyStaticWorkshareLoop and applyStaticChunkedWorkshareLoop.
 * Enable Clang to use the OpenMPIRBuilder in the presence of the schedule clause.

Differential Revision: https://reviews.llvm.org/D114413
2022-02-28 18:18:33 -06:00
Okwan Kwon 4c901bf447 [mlir] Match Arithmetic::ConstantOp and Tensor::ExtractSliceOp.
Add a pattern matcher for ExtractSliceOp when its source is a constant.

The matching heuristics can be governed by the control function since
generating a new constant is not always beneficial.
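
A sketch of the kind of rewrite this enables (hypothetical shapes and values):
```mlir
// Extracting a slice from a splat constant...
%cst = arith.constant dense<1.0> : tensor<16x16xf32>
%slice = tensor.extract_slice %cst[0, 0] [4, 4] [1, 1]
    : tensor<16x16xf32> to tensor<4x4xf32>

// ...can be rewritten as a smaller constant, if the control function allows:
// %slice = arith.constant dense<1.0> : tensor<4x4xf32>
```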

Differential Revision: https://reviews.llvm.org/D119605
2022-02-28 23:09:03 +00:00
Lei Zhang 96bc2233c4 [mlir][linalg] Enhance FoldInsertPadIntoFill to support op chain
If we have a chain of `tensor.insert_slice` ops inserting some
`tensor.pad` op into a `linalg.fill` and the ranges do not overlap,
we can also elide the `tensor.pad` later.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D120446
2022-02-28 16:51:17 -05:00
Lei Zhang 5d47332783 [mlir][linalg] Fold tensor.pad when inserting into linalg.fill
Fold tensor.insert_slice(tensor.pad(<input>), linalg.fill) into
tensor.insert_slice(<input>, linalg.fill) if the padding value and
the filling value are the same.
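
A sketch of the fold (hypothetical shapes; %cst serves as both the pad value and the fill value):
```mlir
%fill = linalg.fill(%cst, %init) : f32, tensor<8x8xf32> -> tensor<8x8xf32>
%pad = tensor.pad %src low[0, 0] high[1, 1] {
^bb0(%i: index, %j: index):
  tensor.yield %cst : f32
} : tensor<3x3xf32> to tensor<4x4xf32>
%res = tensor.insert_slice %pad into %fill[0, 0] [4, 4] [1, 1]
    : tensor<4x4xf32> into tensor<8x8xf32>

// The pad can be elided because the padded cells already hold %cst:
// %res = tensor.insert_slice %src into %fill[0, 0] [3, 3] [1, 1]
//     : tensor<3x3xf32> into tensor<8x8xf32>
```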

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D120410
2022-02-28 16:42:32 -05:00
Okwan Kwon 4f5eb53e68 Revert "[mlir] Fold Arithmetic::ConstantOp and Tensor::ExtractSliceOp."
This reverts commit 3104994104.
2022-02-28 19:14:05 +00:00
Okwan Kwon 3104994104 [mlir] Fold Arithmetic::ConstantOp and Tensor::ExtractSliceOp.
Fold ExtractSliceOp when the source is a constant.
2022-02-28 17:47:29 +00:00
gysit 11d144c576 [mlir][linalg] Check the iterator types are valid.
Improve the LinalgOp verification to ensure the iterator types are known. Previously, unknown iterator types were ignored without warning, which could lead to confusing bugs.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D120649
2022-02-28 11:25:40 +00:00
Adrian Kuegel 44adca60d4 [mlir] Remove unused static variables (NFC) 2022-02-28 11:52:39 +01:00
Alexander Belyaev 1a829d2d06 [mlir] Purge linalg.tiled_loop.
Differential Revision: https://reviews.llvm.org/D119415
2022-02-28 09:05:18 +01:00
River Riddle b474ca1d5a [PDLL] Properly error out on returning results from native constraints
PDL currently doesn't support result values from constraints, so we need
to error out until this is actually supported in order to avoid crashes.

Differential Revision: https://reviews.llvm.org/D119782
2022-02-26 11:08:51 -08:00
River Riddle 9ad64a5c78 [mlir:PDLL] Add support for C++ generation
This commit adds a C++ generator to PDLL that generates wrapper PDL patterns
directly usable in C++ code, and also generates the definitions of native constraints/rewrites
that have code bodies specified in PDLL. This generator is effectively the PDLL equivalent of
the current DRR generator, and will allow easy replacement of DRR patterns with PDLL patterns.
A followup will start to utilize this for end-to-end integration testing and showcase
how to use this as a drop-in replacement for DRR tablegen usage.

Differential Revision: https://reviews.llvm.org/D119781
2022-02-26 11:08:51 -08:00
River Riddle a486cf5e98 [mlir:PDLL] Fix handling of unspecified operands/results on operation expressions
If the operand list or result list of an operation expression is not specified, we interpret
this as meaning that the operands/results are "unconstrained" (i.e., "could be anything").
We currently don't properly differentiate this case from the case of
"no operands/results". This commit adds the insertion of implicit value/type range
variables when these lists are unspecified. This allows for adding proper support
for when zero operands or results are expected.

Differential Revision: https://reviews.llvm.org/D119780
2022-02-26 11:08:51 -08:00
River Riddle 95b4e88b1d [mlir:PDLL] Add support for PDL MLIR code generation
This commit starts to plumb PDLL down into MLIR and adds an initial
PDL generator. After this commit, we conceptually have support for
end-to-end execution of PDLL. Followups will add CPP generation to
match the current DRR setup, and begin to add various end-to-end
tests to test PDLL execution.

Differential Revision: https://reviews.llvm.org/D119779
2022-02-26 11:08:51 -08:00
Aart Bik 180c9f9efe [mlir][sparse] enable scalar test
Removed a TODO now that we support scalars properly.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D120590
2022-02-25 15:05:25 -08:00
Bixia Zheng 6f07191101 [mlir][sparse][taco] Support reduction to scalar tensors.
The PyTACO DSL doesn't support reduction to scalars. This change
enhances the MLIR-PyTACO implementation to support reduction to scalars.

Extend an existing test to show the syntax of reduction to scalars and
two methods to retrieve the scalar values.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D120572
2022-02-25 14:17:45 -08:00
Hanhan Wang 748bf4bb28 [mlir][Linalg] Add support for tileFuseAndDistribute on tensors.
This extends TileAndFuse to handle distribution on tensors.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D120441
2022-02-25 11:51:11 -08:00
Diego Caballero 875bbce9f7 [mlir][Vector] Prevent AVX2 lowering for non-f32 transpose ops
The AVX2 lowering for transpose operations is only applicable to f32 vector types.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D120427
2022-02-25 19:27:32 +00:00
Diego Caballero d7e0a0846b [mlir][Vector] Generalize AVX2 transpose lowering to n-D vectors
The existing AVX2 lowering patterns for the transpose op only trigger if the
input vector is 2-D. This patch extends the patterns to trigger for n-D vectors
that are effectively 2-D vectors (e.g., vector<1x4x1x8x1>). The main constraint
for the generalized AVX2 patterns to be applicable to these vectors is that the
dimensions that are greater than one must be transposed. Otherwise, the existing
patterns are not applicable.
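
For example (hypothetical types), the following is effectively a 2-D 4x8 transpose, since only the non-unit dimensions are permuted:
```mlir
%t = vector.transpose %v, [0, 3, 2, 1, 4]
    : vector<1x4x1x8x1xf32> to vector<1x8x1x4x1xf32>
```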

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D119505
2022-02-25 19:27:32 +00:00
Chia-hung Duan 9445b39673 [mlir] Support verification order (2/3)
This change gives an explicit order of verifier execution and adds
`hasRegionVerifier` and `verifyWithRegions` to increase the granularity
of verifier classification. The order is as below:

1. InternalOpTrait will be verified first; these can be run independently.
2. `verifyInvariants`, which is constructed by ODS; it verifies types,
   attributes, etc.
3. Other traits/interfaces that have marked their verifier as
   `verifyTrait` or `verifyWithRegions=0`.
4. Custom verifier which is defined in the op and has marked
   `hasVerifier=1`.

If an operation has regions, then it may have a second phase:

5. Traits/interfaces that have marked their verifier as
   `verifyRegionTrait` or `verifyWithRegions=1`. This implies the
   verifier needs to access the operations in its regions.
6. Custom verifier which is defined in the op and has marked
   `hasRegionVerifier=1`.

Note that the second phase will be run after the operations in the
region are verified. Based on the verification order, you will be able to
avoid verifying duplicate things.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D116789
2022-02-25 19:04:56 +00:00
Bixia Zheng c601dfbcc2 [mlir][sparse][taco] Use np.array_equal to compare integer values.
Fix MLIR-PyTACO and some tests to use np.array_equal to compare integer
values.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D120526
2022-02-25 07:38:15 -08:00
gysit cd2776b0d5 [mlir][OpDSL] Split arithmetic functions.
Split the arithmetic functions into unary and binary functions. The revision prepares the introduction of unary and binary function attributes that work similarly to type function attributes.

Depends On D120108

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D120109
2022-02-25 15:27:42 +00:00
Bixia Zheng 90f22ab3ad [mlir][sparse][taco] Add support for scalar tensors.
This change allows the use of scalar tensors with index 0 in tensor index
expressions. In this case, the scalar value is broadcast to match the
dimensions of other tensors in the same expression.

Using scalar tensors as a destination in tensor index expressions is not
supported in the PyTACO DSL.

Add a PyTACO test to show the use of scalar tensors.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D120524
2022-02-25 07:20:15 -08:00
gysit 4d4cb17da8 [mlir][OpDSL] Refactor function handling.
Prepare the OpDSL function handling to introduce more function classes. A follow-up commit will split ArithFn into UnaryFn and BinaryFn. This revision prepares the split by adding a function kind enum to handle different function types using a single class on the various levels of the stack (for example, there is now one TensorFn and one ScalarFn).

Depends On D119718

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D120108
2022-02-25 15:05:32 +00:00
gysit 51fdd802c7 [mlir][OpDSL] Add type function attributes.
Previously, OpDSL operations used hardcoded type conversion operations (cast or cast_unsigned). Supporting signed and unsigned casts thus meant implementing two different operations. Type function attributes allow us to define a single operation that has a cast type function attribute, which at operation instantiation time may be set to cast or cast_unsigned. We may, for example, define a matmul operation with a cast argument:

```
@linalg_structured_op
def matmul(A=TensorDef(T1, S.M, S.K), B=TensorDef(T2, S.K, S.N), C=TensorDef(U, S.M, S.N, output=True),
    cast=TypeFnAttrDef(default=TypeFn.cast)):
  C[D.m, D.n] += cast(U, A[D.m, D.k]) * cast(U, B[D.k, D.n])
```

When instantiating the operation the attribute may be set to the desired cast function:

```
linalg.matmul(lhs, rhs, outs=[out], cast=TypeFn.cast_unsigned)
```

The revision introduces an enum in the Linalg dialect that maps one-to-one to the type functions defined by OpDSL.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D119718
2022-02-25 08:25:23 +00:00
Thomas Raoux b1357fe618 [mlir][memref] Add transformation to do loop multi-buffering
This transformation is useful to break dependencies between consecutive loop
iterations by increasing the size of a temporary buffer. This is usually
combined with heavy software pipelining.
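
A rough sketch of the idea (illustrative IR with hypothetical shapes, not the transformation's actual output):
```mlir
// Before: a single buffer carries a dependency between loop iterations.
%buf = memref.alloc() : memref<4x128xf32>

// After multi-buffering by a factor of 2: iterations alternate between two
// copies of the buffer, breaking the inter-iteration dependency.
%mb = memref.alloc() : memref<2x4x128xf32>
scf.for %i = %c0 to %cN step %c1 {
  %idx = arith.remsi %i, %c2 : index
  %view = memref.subview %mb[%idx, 0, 0] [1, 4, 128] [1, 1, 1]
      : memref<2x4x128xf32> to memref<4x128xf32, affine_map<(d0, d1)[s0] -> (d0 * 128 + d1 + s0)>>
  // ... use %view where %buf was used ...
}
```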

Differential Revision: https://reviews.llvm.org/D119406
2022-02-24 09:41:21 -08:00
Marius Brehler 1fa1251116 [mlir][emitc] Add a variable op
This adds a variable op, emitted as a C/C++ local variable, which can be
used if the `emitc.constant` op is not sufficient.

As an example, the canonicalization pass would transform
```mlir
%0 = "emitc.constant"() {value = 0 : i32} : () -> i32
%1 = "emitc.constant"() {value = 0 : i32} : () -> i32
%2 = emitc.apply "&"(%0) : (i32) -> !emitc.ptr<i32>
%3 = emitc.apply "&"(%1) : (i32) -> !emitc.ptr<i32>
emitc.call "write"(%2, %3) : (!emitc.ptr<i32>, !emitc.ptr<i32>) -> ()
```
into
```mlir
%0 = "emitc.constant"() {value = 0 : i32} : () -> i32
%1 = emitc.apply "&"(%0) : (i32) -> !emitc.ptr<i32>
%2 = emitc.apply "&"(%0) : (i32) -> !emitc.ptr<i32>
emitc.call "write"(%1, %2) : (!emitc.ptr<i32>, !emitc.ptr<i32>) -> ()
```
resulting in pointer aliasing, as %1 and %2 point to the same address.
In such a case, the `emitc.variable` operation can be used instead.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D120098
2022-02-24 15:25:21 +00:00
Benjamin Kramer 92cf9f1481 [mlir][linalg] Cast back to the original type after making linalg.generic outputs more static
This codepath was entirely untested.

Differential Revision: https://reviews.llvm.org/D120473
2022-02-24 13:35:54 +01:00
Javier Setoain cd0d21b47b [mlir][LLVM] Allow scalable vectors in ShuffleVectorOp
The current implementation of ShuffleVectorOp assumes all vectors are
fixed-length. LLVM IR allows shufflevector operations on scalable vectors,
and the current translation between the LLVM dialect and LLVM IR does the
right thing when the shuffle mask is all zeroes. This is required to
do a splat operation on a scalable vector, but a shuffle doesn't make sense
for scalable vectors outside of that operation, i.e., with non-all-zero
masks.

Differential Revision: https://reviews.llvm.org/D118371
2022-02-24 11:24:34 +00:00
Matthias Springer 25bc684603 [mlir][linalg][bufferize] Always bufferize in-place with "out" operands by default
In D115022, we introduced an optimization where OpResults of a `linalg.generic` may bufferize in-place with an "in" OpOperand if the corresponding "out" OpOperand is not used in the computation.

This optimization can lead to unexpected behavior if the newly chosen OpOperand is in the same alias set as another OpOperand (that is used in the computation). In that case, the newly chosen OpOperand must bufferize out-of-place. This can be confusing to users, as always choosing the "out" OpOperand (regardless of whether it is used) would be expected when having the notion of "destination-passing style" in mind.

With this change, we go back to always bufferizing in-place with "out" OpOperands by default, but letting users override the behavior with a bufferization option.

Differential Revision: https://reviews.llvm.org/D120182
2022-02-24 19:58:05 +09:00
rkayaith e9db306dcd [mlir][python] Support more types in IntegerAttr.value
Previously only accessing values for `index` and signless int types
would work; signed and unsigned ints would hit an assert in
`IntegerAttr::getInt`. This exposes `IntegerAttr::get{S,U}Int` to the C
API and calls the appropriate function from the python bindings.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D120194
2022-02-24 10:26:31 +01:00
Bixia Zheng c8ae8cfb5d [mlir][sparse][taco] Add support for float32.
Previously, we only supported float64. We now support float32 and float64. When
constructing a tensor without providing a data type, the default is float32.

Fix the tests for data type consistency. All PyTACO application tests now use
float32 to match the default data type of TACO. Other tests may use float32 or
float64.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D120356
2022-02-23 18:24:22 -08:00
Aart Bik 652b39b46f [mlir][sparse][linalg] add linalg rewriting specific to sparse tensors
Now that sparse tensor types are first-class citizens and the sparse compiler
is taking shape, it is time to make sure other compiler optimizations compose
well with sparse tensors. Mostly, this should be completely transparent (i.e.,
dense and sparse take the same path). However, in some cases, optimizations
only make sense in the context of sparse tensors. This is a first example of
such an optimization, where fusing a sampled elt-wise multiplication only makes
sense when the resulting kernel has a potential lower asymptotic complexity due
to the sparsity.

As an extreme example, running SDDMM with 1024x1024 matrices and a sparse
sampling matrix with only two elements runs in 463.55ms in the unfused
case but just 0.032ms in the fused case, with a speedup of 14485x that
is only possible in the exciting world of sparse computations!

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D120429
2022-02-23 17:29:41 -08:00
William S. Moses 1b2a1f8473 [MLIR][Arith] Canonicalize cmpf(int to fp) to cmpi
Given a cmpf of either uitofp or sitofp and a constant, attempt to canonicalize it to a cmpi.

This PR ports the equivalent code within LLVM so that it now applies to MLIR arith.
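
A sketch of the pattern (hypothetical values):
```mlir
// Comparing an integer converted to floating point against a constant...
%f = arith.sitofp %i : i32 to f32
%c = arith.constant 3.000000e+00 : f32
%r = arith.cmpf olt, %f, %c : f32

// ...can become a pure integer comparison when the constant is exactly
// representable after conversion:
// %c3 = arith.constant 3 : i32
// %r = arith.cmpi slt, %i, %c3 : i32
```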

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D117257
2022-02-23 14:09:20 -05:00
Eugene Zhulenev beff16f7bd [mlir] Async: update condition for dispatching block-aligned compute function
+ compare block size with the unrollable inner dimension
+ reduce nesting in the code and simplify the IR building a bit

Reviewed By: cota

Differential Revision: https://reviews.llvm.org/D120075
2022-02-23 10:29:55 -08:00
Aart Bik 8b83b8f131 [mlir][sparse] refactor sparse compiler pipeline to single place
Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D120347
2022-02-22 16:23:56 -08:00
Okwan Kwon f79f430d4b Fold Tensor.extract_slice into a constant splat.
Fold tensor.extract_slice into arith.constant when the source is a constant
splat and the result type is statically shaped.
2022-02-22 21:39:57 +00:00
Matthias Springer d2dacde5d8 [mlir][bufferize][NFC] Rename `comprehensive-function-bufferize` to `one-shot-bufferize`
The related functionality is moved over to the bufferization dialect. Test cases are cleaned up a bit.

Differential Revision: https://reviews.llvm.org/D120191
2022-02-22 17:19:20 +09:00
Matthias Springer 41cb504b7c [mlir][linalg][bufferize][NFC] Move interface impl to Linalg Transforms
This is for consistency with other dialects.

Differential Revision: https://reviews.llvm.org/D120190
2022-02-21 17:14:24 +09:00
Prateek Gupta 1a2bb03eda [MLIR][LINALG] Add canonicalization pattern in `linalg.generic` op for static shape inference.
This commit adds a canonicalization pattern in the `linalg.generic` op
for static shape inference. If any of the inputs or outputs has a
static shape or is cast from a tensor of static shape, then the
shapes of all the inputs and outputs can be inferred by using the
affine map of the static-shape input/output.
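
A sketch of the rewrite (hypothetical IR):
```mlir
#map = affine_map<(d0, d1) -> (d0, d1)>

// The input is a cast from a statically shaped tensor...
%0 = tensor.cast %static : tensor<4x8xf32> to tensor<?x?xf32>
%1 = linalg.generic {indexing_maps = [#map, #map],
                     iterator_types = ["parallel", "parallel"]}
    ins(%0 : tensor<?x?xf32>) outs(%init : tensor<?x?xf32>) {
^bb0(%in: f32, %out: f32):
  %2 = arith.negf %in : f32
  linalg.yield %2 : f32
} -> tensor<?x?xf32>

// ...so tensor<4x8xf32> can be inferred for all operands and results, with a
// trailing tensor.cast restoring the original tensor<?x?xf32> type for users.
```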

Signed-Off-By: Prateek Gupta <prateek@nod-labs.com>

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D118929
2022-02-21 07:51:13 +00:00
Shraiysh Vaishay c1e4e01945 [mlir][OpenMP] Added assemblyFormat for SectionsOp
This patch adds an assemblyFormat for the omp.sections operation.

Some existing functions have been altered to fit the custom directive
in the assemblyFormat. This has led to their call sites being modified too,
but those will be removed in later patches, when other operations get
their assemblyFormat. Not all operations were changed in one patch, for
ease of review.
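
For reference, a minimal sketch of the shape of the op (clauses and exact printed form elided; illustrative only):
```mlir
omp.sections {
  omp.section {
    // ... first section body ...
    omp.terminator
  }
  omp.section {
    // ... second section body ...
    omp.terminator
  }
  omp.terminator
}
```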

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D120176
2022-02-21 13:01:49 +05:30
Matthias Springer 4ec00fb3ea [mlir][bufferize] Add a way for ops to fail the analysis
Add `BufferizableOpInterface::verifyAnalysis`. Ops can implement this method to check for expected invariants and limitations.

The purpose of this change is to introduce a modular way of checking assertions such as `assertScfForAliasingProperties`.

Differential Revision: https://reviews.llvm.org/D120189
2022-02-20 05:51:18 +09:00
Shraiysh Vaishay 39151717db [mlir][OpenMP] Added assemblyFormat for ParallelOp
This patch adds an assemblyFormat for the omp.parallel operation.

Some existing functions have been altered to fit the custom directive
in the assemblyFormat. This has led to their call sites being modified too,
but those will be removed in later patches, when other operations get
their assemblyFormat. Not all operations were changed in one patch, for
ease of review.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D120157
2022-02-19 10:28:58 +05:30
Aart Bik 9b9a084af0 [mlir][sparse][pytaco] test with 3-dim tensor and scalar
Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D120163
2022-02-18 15:16:35 -08:00
Aart Bik 6438783fda [mlir][sparse] provide more types for external to/from MLIR routines
These routines will need to be specialized a lot more based on value types,
index types, pointer types, and permutation/dimension ordering. This is a
careful first step, providing some functionality needed in the PyTACO bridge.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D120154
2022-02-18 13:36:52 -08:00
Shraiysh Vaishay 60210f9acb [mlir][OpenMP] Added assemblyFormat for TargetOp
This patch removes the custom parser/printer for `omp.target` and adds an
assemblyFormat.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D120138
2022-02-19 01:22:59 +05:30
Shraiysh Vaishay 5ee500acbb [mlir][OpenMP] Remove clauses that are not being handled
This patch removes the following clauses from OpenMP Dialect:

 - private
 - firstprivate
 - lastprivate
 - shared
 - default
 - copyin
 - copyprivate

The privatization clauses are being handled in the flang frontend. The
data copying clauses are not being handled anywhere for now. Once
we have a better picture of how to handle these clauses in the OpenMP
dialect, we can add them back. For the time being, the unneeded
clauses are removed.

For a detailed discussion about this, refer to [[ https://discourse.llvm.org/t/rfc-privatisation-in-openmp-dialect/3526 | Privatisation in OpenMP dialect ]]

Reviewed By: kiranchandramohan, clementval

Differential Revision: https://reviews.llvm.org/D120029
2022-02-19 01:13:05 +05:30
William S. Moses 670aeece51 [MLIR][OpenMP][SCF] Mark parallel regions as allocation scopes
MLIR has the notion of allocation scopes, which specify that stack allocations (e.g. memref.alloca, llvm.alloca) should be freed, or equivalently aren't available, at the end of the corresponding region.
Currently neither OpenMP parallel nor SCF parallel regions have the notion of such a scope.

This clearly makes sense for an OpenMP parallel, as it is implemented with a new function which outlines the region, and any allocations in that newly outlined function by definition have a lifetime that ends at the return of the function.

While SCF.parallel doesn't have a guaranteed runtime it is implemented with, this similarly makes sense for SCF.parallel, since otherwise an allocation within an SCF.parallel will needlessly continue to occupy stack memory that isn't cleaned up until the function (or other allocation-scope op) that contains the SCF.parallel returns. This means that it is impossible to represent thread- or iteration-local memory without causing a stack blow-up. In the case that this stack-blow-up behavior is intended, it can be represented equivalently with an allocation outside of the SCF.parallel with a size equal to the number of iterations.
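
For illustration, a hypothetical iteration-local temporary that this change makes well-scoped:
```mlir
// With scf.parallel as an allocation scope, %tmp is freed at the end of each
// iteration instead of accumulating until the enclosing function returns.
scf.parallel (%i) = (%c0) to (%cN) step (%c1) {
  %tmp = memref.alloca() : memref<128xf32>
  // ... iteration-local use of %tmp ...
  scf.yield
}
```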

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D119743
2022-02-18 11:06:32 -05:00