Commit Graph

5 Commits

Author SHA1 Message Date
Lei Zhang 50000abe3c [mlir] Use affine.apply when distributing to processors
This makes it easy to compose the distribution computation with
other affine computations.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D98171
2021-03-09 08:37:20 -05:00
Alex Zinenko 60f443bb3b [mlir] Change dialect namespace loop->scf
All ops of the SCF dialect now use the `scf.` prefix instead of `loop.`. This
is a part of dialect renaming.

Differential Revision: https://reviews.llvm.org/D79844
2020-05-13 19:20:21 +02:00
Mehdi Amini bab5bcf8fd Add a flag on the context to protect against creation of operations in unregistered dialects
Differential Revision: https://reviews.llvm.org/D76903
2020-03-30 19:37:31 +00:00
Mahesh Ravishankar 9cbbd8f4df Support lowering of imperfectly nested loops into GPU dialect.
The current lowering of loops to GPU only supports lowering of loop
nests where the loops mapped to workgroups and workitems are perfectly
nested. Here a new lowering is added to handle lowering of imperfectly
nested loop body with the following properties
1) The loops partitioned to workgroups are perfectly nested.
2) The loop body of the inner most loop partitioned to workgroups can
contain one or more loop nests that are to be partitioned across
workitems. Each individual loops nests partitioned to workitems should
also be perfectly nested.
3) The number of workgroups and workitems are not deduced from the
loop bounds but are passed in by the caller of the lowering as values.
4) For statements within the perfectly nested loop nest partitioned
across workgroups that are not loops, it is valid to have all threads
execute that statement. This is NOT verified.

PiperOrigin-RevId: 277958868
2019-11-01 10:52:06 -07:00
Nicolas Vasilache db4cd1c8dc Utility function to map a loop on a parametric grid of virtual processors
This CL introduces a simple loop utility function which rewrites the bounds and step of a loop so as to become mappable on a regular grid of processors whose identifiers are given by SSA values.

A corresponding unit test is added.

For example, using CUDA terminology, and assuming a 2-d grid with processorIds = [blockIdx.x, threadIdx.x] and numProcessors = [gridDim.x, blockDim.x], the loop:
```
   loop.for %i = %lb to %ub step %step {
     ...
   }
```
is rewritten into a version resembling the following pseudo-IR:
```
   loop.for %i = %lb + threadIdx.x + blockIdx.x * blockDim.x to %ub
      step %gridDim.x * blockDim.x {
     ...
   }
```

PiperOrigin-RevId: 258945942
2019-07-19 11:40:31 -07:00