llvm-project/mlir/lib/Transforms/ParallelLoopCollapsing.cpp

//===- ParallelLoopCollapsing.cpp - Pass collapsing parallel loop indices -===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#include "PassDetail.h"
#include "mlir/Dialect/SCF/SCF.h"
#include "mlir/Transforms/LoopUtils.h"
#include "mlir/Transforms/Passes.h"
#include "mlir/Transforms/RegionUtils.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"

#define DEBUG_TYPE "parallel-loop-collapsing"

using namespace mlir;

namespace {
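/// Collapses an scf::ParallelOp with N induction variables into one with
/// fewer than N. Each iteration space is normalized, one new induction
/// variable then stands in for several original loops, and that variable is
/// split back into the original values inside the loop body.
struct ParallelLoopCollapsing
    : public ParallelLoopCollapsingBase<ParallelLoopCollapsing> {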
  void runOnOperation() override {
    Operation *module = getOperation();

    module->walk([&](scf::ParallelOp op) {
      // The common case for the GPU dialect is collapsing the ParallelOp to at
      // most 3 induction variables, so the pass exposes three index-list
      // options. Each non-empty list names the original loops to merge.
      llvm::SmallVector<std::vector<unsigned>, 3> combinedLoops;
      if (!clCollapsedIndices0.empty())
        combinedLoops.push_back(clCollapsedIndices0);
      if (!clCollapsedIndices1.empty())
        combinedLoops.push_back(clCollapsedIndices1);
      if (!clCollapsedIndices2.empty())
        combinedLoops.push_back(clCollapsedIndices2);
      collapseParallelLoops(op, combinedLoops);
    });
  }
};
} // namespace

std::unique_ptr<Pass> mlir::createParallelLoopCollapsingPass() {
  return std::make_unique<ParallelLoopCollapsing>();
}
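
The index arithmetic behind collapsing can be sketched outside of MLIR. The following is a minimal standalone illustration, not the code of `collapseParallelLoops`: the names `LoopBounds`, `tripCount`, `collapsedTripCount`, and `splitCollapsedIndex` are hypothetical, and the row-major order (last loop varies fastest) is an assumption made for the example. It shows the three steps: normalize each loop to a trip count, iterate one variable over the product of the trip counts, and split that variable back into the original induction values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one original loop:
//   for (i = lower; i < upper; i += step), with step > 0.
struct LoopBounds {
  int64_t lower, upper, step;
};

// Normalization: number of iterations, i.e. the loop runs 0..tripCount-1.
int64_t tripCount(const LoopBounds &b) {
  if (b.upper <= b.lower)
    return 0;
  return (b.upper - b.lower + b.step - 1) / b.step;
}

// Size of the single collapsed iteration space: product of all trip counts.
int64_t collapsedTripCount(const std::vector<LoopBounds> &loops) {
  int64_t total = 1;
  for (const LoopBounds &b : loops)
    total *= tripCount(b);
  return total;
}

// Split a collapsed index back into the original induction values.
// Assumes row-major order and that every loop has a non-zero trip count
// (otherwise the collapsed space is empty and this is never reached).
std::vector<int64_t> splitCollapsedIndex(int64_t collapsed,
                                         const std::vector<LoopBounds> &loops) {
  std::vector<int64_t> ivs(loops.size());
  for (std::size_t i = loops.size(); i-- > 0;) {
    int64_t count = tripCount(loops[i]);
    assert(count > 0 && "empty loop in collapsed group");
    int64_t normalized = collapsed % count; // position within loop i
    collapsed /= count;                     // remaining outer position
    ivs[i] = loops[i].lower + normalized * loops[i].step; // undo normalization
  }
  return ivs;
}
```

For two loops `{0, 4, 1}` and `{2, 10, 3}` the trip counts are 4 and 3, so the collapsed loop has 12 iterations, and collapsed index 7 maps back to the original values 2 and 5.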