[Scheduling] Define problem to model operator chaining in cyclic problem. (#6485)

This PR defines a ChainingCyclicProblem, a hybrid of ChainingProblem and CyclicProblem, and adapts the simplex scheduler to solve it. Most of the implementation reuses and adapts code from the two base problems. The new problem corresponds to what a static HLS tool solves for loop pipelining.
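
As a rough illustration of the cyclic half of the problem, the sketch below checks the usual modulo-scheduling precedence constraint, start(src) + latency(src) <= start(dst) + distance * II, against the start times of the `cyclic` test instance added in this PR. `Op`, `Dep` and all function names are hypothetical stand-ins invented for this example, not CIRCT types, and the chaining (propagation delay) half is deliberately left out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-ins for this sketch; these are not CIRCT types.
struct Op {
  int startTime; // cycle in which the operation starts
  int latency;   // cycles until its result is available
};

struct Dep {
  int src;      // index of the producing operation
  int dst;      // index of the consuming operation
  int distance; // iterations between producer and consumer (0 = same iteration)
};

// Returns true if every dependence satisfies
//   start(src) + latency(src) <= start(dst) + distance * II.
bool satisfiesCyclicPrecedence(const std::vector<Op> &ops,
                               const std::vector<Dep> &deps, int ii) {
  for (const Dep &d : deps) {
    const Op &src = ops[d.src];
    const Op &dst = ops[d.dst];
    if (src.startTime + src.latency > dst.startTime + d.distance * ii)
      return false;
  }
  return true;
}

// Start times and latencies copied from the `cyclic` test instance:
// indices 0..3 are %0..%3, index 4 is @op4, index 5 is @last.
std::vector<Op> cyclicOps() {
  return {{0, 1}, {1, 0}, {0, 2}, {2, 1}, {2, 1}, {3, 1}};
}

std::vector<Dep> cyclicDeps() {
  return {{4, 1, 1}, // @op4 -> %1, dist<1>
          {4, 2, 2}, // @op4 -> %2, dist<2>
          {1, 3, 0}, {2, 3, 0},  // %1, %2 -> %3
          {2, 4, 0}, {0, 4, 0},  // %2, %0 -> @op4
          {4, 5, 0}};            // @op4 -> @last
}

// Small self-check of the sketch.
void checkCyclicExample() {
  assert(satisfiesCyclicPrecedence(cyclicOps(), cyclicDeps(), 2));
  assert(!satisfiesCyclicPrecedence(cyclicOps(), cyclicDeps(), 1));
}
```

With these values the schedule is valid for II = 2 but not for II = 1, where the inter-iteration edges out of `@op4` are violated.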

---------

Co-authored-by: leothaud <dylan.leothaud@irisa.fr>
leothaud 2023-12-12 09:43:26 +01:00 committed by GitHub
parent 4c2fd5da7d
commit 2a0deb37cb
GPG Key ID: 4AEE18F83AFDEB23
11 changed files with 380 additions and 2 deletions


@ -155,7 +155,7 @@ for (auto &op : prob.getOperations())
llvm::dbgs() << *prob.getStartTime(&op) << "\n";
```
And that's it! For a more practical example, have a look at the
[`AffineToPipeline`](https://github.com/llvm/circt/blob/main/lib/Conversion/AffineToPipeline/AffineToPipeline.cpp)
pass.
@ -273,6 +273,14 @@ as well as redundant iteration over the problem components.
- [ChainingProblem](https://circt.llvm.org/doxygen/classcirct_1_1scheduling_1_1ChainingProblem.html):
Extends `Problem` to consider the accumulation of physical propagation delays
on combinational paths along SSA dependences.
- [ChainingCyclicProblem](https://circt.llvm.org/doxygen/classcirct_1_1scheduling_1_1ChainingCyclicProblem.html):
Extends `ChainingProblem` and `CyclicProblem` to consider the accumulation
of physical propagation delays on combinational paths along SSA dependences
on a cyclic scheduling problem. Note that the problem does not model
propagation delays along inter-iteration dependences. These are commonly
represented as auxiliary dependences, which are already excluded in the
parent `ChainingProblem`. In addition, the `ChainingCyclicProblem` explicitly
prohibits the use of def-use dependences with a non-zero distance.
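
The def-use restriction described above can be sketched in a few lines of C++. This is a minimal illustration under stated assumptions, not the CIRCT implementation: `DepInfo`, `isAllowed` and `checkDefUseExamples` are hypothetical names invented for this example, and an unset distance is assumed to mean zero.

```cpp
#include <cassert>
#include <optional>

// Hypothetical, simplified dependence record for this sketch; it is not the
// CIRCT `Dependence` class.
struct DepInfo {
  bool isAuxiliary;                 // false means an SSA def-use edge
  std::optional<unsigned> distance; // unset is treated as 0 (assumption)
};

// Only auxiliary dependences may carry a non-zero distance; a def-use
// dependence must stay within a single iteration.
bool isAllowed(const DepInfo &dep) {
  return dep.isAuxiliary || dep.distance.value_or(0) == 0;
}

// Small self-check of the sketch.
void checkDefUseExamples() {
  assert(isAllowed(DepInfo{false, std::nullopt})); // plain def-use edge
  assert(isAllowed(DepInfo{false, 0u}));           // explicit zero distance
  assert(!isAllowed(DepInfo{false, 3u}));          // def-use with dist 3
  assert(isAllowed(DepInfo{true, 1u}));            // auxiliary back edge
}
```

A rejected case in this sketch corresponds to a def-use operand annotated with a non-zero `dist`, while inter-iteration edges expressed as auxiliary dependences remain allowed.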
NB: The classes listed above each model a *trait*-like aspect of scheduling.
These can be used as-is, but are also intended for mixing and matching, even
@ -300,7 +308,7 @@ chaining-enabled modulo scheduling problem.
## Utilities
See
[`Utilities.h`](https://github.com/llvm/circt/blob/main/include/circt/Scheduling/Utilities.h):
- Topological graph traversal
- DFA to compute combinational path delays


@ -520,6 +520,18 @@ struct Default<scheduling::ModuloProblem> {
Default<scheduling::CyclicProblem>::instanceProperties;
};
template <>
struct Default<scheduling::ChainingCyclicProblem> {
static constexpr auto operationProperties =
Default<scheduling::ChainingProblem>::operationProperties;
static constexpr auto operatorTypeProperties =
Default<scheduling::ChainingProblem>::operatorTypeProperties;
static constexpr auto dependenceProperties =
Default<scheduling::CyclicProblem>::dependenceProperties;
static constexpr auto instanceProperties =
Default<scheduling::CyclicProblem>::instanceProperties;
};
} // namespace ssp
} // namespace circt


@ -62,6 +62,18 @@ LogicalResult scheduleSimplex(ModuloProblem &prob, Operation *lastOp);
LogicalResult scheduleSimplex(ChainingProblem &prob, Operation *lastOp,
float cycleTime);
/// Solve the resource-free cyclic, chaining-enabled problem using linear
/// programming and a handwritten implementation of the simplex algorithm.
/// This approach is a hybrid of the ChainingProblem simplex scheduler and the
/// CyclicProblem simplex scheduler. The objectives are to determine the
/// smallest feasible initiation interval and to minimize the start time of a
/// given \p lastOp. Fails if the dependence graph contains cycles that do not
/// include at least one edge with a non-zero distance, individual operator
/// types have delays larger than \p cycleTime, or \p prob does not include
/// \p lastOp.
LogicalResult scheduleSimplex(ChainingCyclicProblem &prob, Operation *lastOp,
float cycleTime);
/// Solve the basic problem using linear programming and an external LP solver.
/// The objective is to minimize the start time of the given \p lastOp. Fails if
/// the dependence graph contains cycles, or \p prob does not include \p lastOp.


@ -454,6 +454,29 @@ public:
virtual LogicalResult verify() override;
};
/// This class models the accumulation of physical propagation delays on
/// combinational paths along SSA dependences on a cyclic scheduling problem.
///
/// Each operator type is annotated with estimated values for incoming and
/// outgoing delays. Combinational operators (zero-latency, no internal
/// registers) have only a single delay; this important special case is modeled
/// by setting the incoming and outgoing delays to the same values.
///
/// A solution to this problem comprises per-operation start times in a
/// continuous unit, e.g. in nanoseconds, inside the discrete time steps/cycles
/// determined by the underlying scheduling problem. Its solution can be used to
/// construct a pipelined datapath with a fixed, integer initiation interval,
/// in which the execution of multiple iterations/samples/etc. may overlap.
class ChainingCyclicProblem : public virtual ChainingProblem,
public virtual CyclicProblem {
DEFINE_COMMON_MEMBERS(ChainingCyclicProblem)
public:
LogicalResult checkDefUse(Dependence dep);
LogicalResult check() override;
LogicalResult verify() override;
};
} // namespace scheduling
} // namespace circt


@ -46,6 +46,8 @@ void PrintPass::runOnOperation() {
printInstance<SharedOperatorsProblem>(instOp, os);
else if (probName.equals("ModuloProblem"))
printInstance<ModuloProblem>(instOp, os);
else if (probName.equals("ChainingCyclicProblem"))
printInstance<ChainingCyclicProblem>(instOp, os);
else {
auto instName = instOp.getSymName().value_or("unnamed");
llvm::errs() << "ssp-print-instance: Unknown problem class '" << probName


@ -43,6 +43,8 @@ static InstanceOp roundtrip(InstanceOp instOp, bool check, bool verify,
return roundtripAs<ModuloProblem>(instOp, check, verify, builder);
if (problemName.equals("ChainingProblem"))
return roundtripAs<ChainingProblem>(instOp, check, verify, builder);
if (problemName.equals("ChainingCyclicProblem"))
return roundtripAs<ChainingCyclicProblem>(instOp, check, verify, builder);
llvm::errs() << "ssp-roundtrip: Unknown problem '" << problemName << "'\n";
return {};


@ -97,6 +97,18 @@ static InstanceOp scheduleChainingProblemWithSimplex(InstanceOp instOp,
return saveProblem(prob, builder);
}
static InstanceOp scheduleChainingCyclicProblemWithSimplex(InstanceOp instOp,
Operation *lastOp,
float cycleTime,
OpBuilder &builder) {
auto prob = loadProblem<scheduling::ChainingCyclicProblem>(instOp);
if (failed(prob.check()) ||
failed(scheduling::scheduleSimplex(prob, lastOp, cycleTime)) ||
failed(prob.verify()))
return {};
return saveProblem(prob, builder);
}
static InstanceOp scheduleWithSimplex(InstanceOp instOp, StringRef options,
OpBuilder &builder) {
auto lastOp = getLastOp(instOp, options);
@ -126,6 +138,14 @@ static InstanceOp scheduleWithSimplex(InstanceOp instOp, StringRef options,
"ChainingProblem simplex scheduler\n";
return {};
}
if (problemName.equals("ChainingCyclicProblem")) {
if (auto cycleTime = getCycleTime(options))
return scheduleChainingCyclicProblemWithSimplex(
instOp, lastOp, cycleTime.value(), builder);
llvm::errs() << "ssp-schedule: Missing option 'cycle-time' for "
"ChainingCyclicProblem simplex scheduler\n";
return {};
}
llvm::errs() << "ssp-schedule: Unsupported problem '" << problemName
<< "' for simplex scheduler\n";


@ -407,6 +407,37 @@ LogicalResult ModuloProblem::verify() {
return success();
}
//===----------------------------------------------------------------------===//
// ChainingCyclicProblem
//===----------------------------------------------------------------------===//
LogicalResult ChainingCyclicProblem::checkDefUse(Dependence dep) {
if (!dep.isAuxiliary() && (getDistance(dep).value_or(0) != 0))
return getContainingOp()->emitError()
<< "Def-use dependence cannot have non-zero distance.\n"
<< "On operation: " << *dep.getDestination() << ".\n";
return success();
}
LogicalResult ChainingCyclicProblem::check() {
for (auto *op : getOperations())
for (auto &dep : getDependences(op))
if (failed(checkDefUse(dep)))
return failure();
if (ChainingProblem::check().succeeded() &&
CyclicProblem::check().succeeded())
return success();
return failure();
}
LogicalResult ChainingCyclicProblem::verify() {
if (ChainingProblem::verify().succeeded() &&
CyclicProblem::verify().succeeded())
return success();
return failure();
}
//===----------------------------------------------------------------------===//
// Dependence
//===----------------------------------------------------------------------===//


@ -290,6 +290,30 @@ public:
LogicalResult schedule() override;
};
// This class solves the resource-free `ChainingCyclicProblem` by relying on
// pre-computed chain-breaking constraints. The optimal initiation interval (II)
// is determined as a side product of solving the parametric problem, and
// corresponds to the "RecMII" (= recurrence-constrained minimum II) usually
// considered as one component in the lower II bound used by modulo schedulers.
class ChainingCyclicSimplexScheduler : public SimplexSchedulerBase {
private:
ChainingCyclicProblem &prob;
float cycleTime;
protected:
Problem &getProblem() override { return prob; }
void fillConstraintRow(SmallVector<int> &row,
Problem::Dependence dep) override;
void fillAdditionalConstraintRow(SmallVector<int> &row,
Problem::Dependence dep) override;
public:
ChainingCyclicSimplexScheduler(ChainingCyclicProblem &prob, Operation *lastOp,
float cycleTime)
: SimplexSchedulerBase(lastOp), prob(prob), cycleTime(cycleTime) {}
LogicalResult schedule() override;
};
} // anonymous namespace
//===----------------------------------------------------------------------===//
@ -1254,6 +1278,55 @@ LogicalResult ChainingSimplexScheduler::schedule() {
return success();
}
//===----------------------------------------------------------------------===//
// ChainingCyclicSimplexScheduler
//===----------------------------------------------------------------------===//
void ChainingCyclicSimplexScheduler::fillConstraintRow(
SmallVector<int> &row, Problem::Dependence dep) {
SimplexSchedulerBase::fillConstraintRow(row, dep);
if (auto dist = prob.getDistance(dep))
row[parameterTColumn] = *dist;
}
void ChainingCyclicSimplexScheduler::fillAdditionalConstraintRow(
SmallVector<int> &row, Problem::Dependence dep) {
fillConstraintRow(row, dep);
// One _extra_ time step breaks the chain (note that the latency is negative
// in the tableau).
row[parameter1Column] -= 1;
}
LogicalResult ChainingCyclicSimplexScheduler::schedule() {
if (failed(checkLastOp()) || failed(computeChainBreakingDependences(
prob, cycleTime, additionalConstraints)))
return failure();
parameterS = 0;
parameterT = 1;
buildTableau();
LLVM_DEBUG(dbgs() << "Initial tableau:\n"; dumpTableau());
if (failed(solveTableau()))
return prob.getContainingOp()->emitError() << "problem is infeasible";
LLVM_DEBUG(dbgs() << "Final tableau:\n"; dumpTableau();
dbgs() << "Optimal solution found with II = " << parameterT
<< " and start time of last operation = "
<< -getParametricConstant(0) << '\n');
prob.setInitiationInterval(parameterT);
for (auto *op : prob.getOperations())
prob.setStartTime(op, getStartTime(startTimeVariables[op]));
auto filledIn = computeStartTimesInCycle(prob);
assert(succeeded(filledIn));
(void)filledIn;
return success();
}
//===----------------------------------------------------------------------===//
// Public API
//===----------------------------------------------------------------------===//
@ -1286,3 +1359,9 @@ LogicalResult scheduling::scheduleSimplex(ChainingProblem &prob,
ChainingSimplexScheduler simplex(prob, lastOp, cycleTime);
return simplex.schedule();
}
LogicalResult scheduling::scheduleSimplex(ChainingCyclicProblem &prob,
Operation *lastOp, float cycleTime) {
ChainingCyclicSimplexScheduler simplex(prob, lastOp, cycleTime);
return simplex.schedule();
}


@ -0,0 +1,12 @@
// RUN: circt-opt %s -ssp-roundtrip=verify -verify-diagnostics -split-input-file
// expected-error@+1 {{Def-use dependence cannot have non-zero distance.}}
ssp.instance @defUse_distance of "ChainingCyclicProblem" {
library {
operator_type @_0 [latency<0>, incDelay<1.0>, outDelay<1.0>]
}
graph {
%0 = ssp.operation<@_0> ()
%1 = ssp.operation<@_0> (%0 [dist<3>])
}
}


@ -0,0 +1,177 @@
// RUN: circt-opt %s -ssp-roundtrip=verify
// RUN: circt-opt %s -ssp-schedule="scheduler=simplex options=cycle-time=5.0" | FileCheck %s -check-prefixes=CHECK,SIMPLEX
// test from cyclic-problems.mlir
// CHECK-LABEL: cyclic
// SIMPLEX-SAME: [II<2>]
ssp.instance @cyclic of "ChainingCyclicProblem" [II<2>] {
library {
operator_type @_0 [latency<0>, incDelay<0.0>, outDelay<0.0>]
operator_type @_1 [latency<1>, incDelay<0.0>, outDelay<0.0>]
operator_type @_2 [latency<2>, incDelay<0.0>, outDelay<0.0>]
}
graph {
%0 = operation<@_1>() [t<0>, z<0.000000e+00 : f32>]
%1 = operation<@_0>(@op4 [dist<1>]) [t<1>, z<0.000000e+00 : f32>]
%2 = operation<@_2>(@op4 [dist<2>]) [t<0>, z<0.000000e+00 : f32>]
%3 = operation<@_1>(%1, %2) [t<2>, z<0.000000e+00 : f32>]
%4 = operation<@_1> @op4(%2, %0) [t<2>, z<0.000000e+00 : f32>]
operation<@_1> @last(%4) [t<3>, z<0.000000e+00 : f32>]
// SIMPLEX: operation<@_1> @last(%{{.*}}) [t<3>, z<0.000000e+00 : f32>]
}
}
// CHECK-LABEL: mobility
// SIMPLEX-SAME: [II<3>]
ssp.instance @mobility of "ChainingCyclicProblem" [II<3>] {
library {
operator_type @_1 [latency<1>, incDelay<0.0>, outDelay<0.0>]
operator_type @_4 [latency<4>, incDelay<0.0>, outDelay<0.0>]
}
graph {
%0 = operation<@_1>() [t<0>, z<0.000000e+00 : f32>]
%1 = operation<@_4>(%0) [t<1>, z<0.000000e+00 : f32>]
%2 = operation<@_1>(%0, @op5 [dist<1>]) [t<4>, z<0.000000e+00 : f32>]
%3 = operation<@_1>(%1, %2) [t<5>, z<0.000000e+00 : f32>]
%4 = operation<@_4>(%3) [t<6>, z<0.000000e+00 : f32>]
%5 = operation<@_1> @op5(%3) [t<6>, z<0.000000e+00 : f32>]
operation<@_1> @last(%4, %5) [t<10>, z<0.000000e+00 : f32>]
// SIMPLEX: @last(%{{.*}}, %{{.*}}) [t<10>, z<0.000000e+00 : f32>]
}
}
// CHECK-LABEL: interleaved_cycles
// SIMPLEX-SAME: [II<4>]
ssp.instance @interleaved_cycles of "ChainingCyclicProblem" [II<4>] {
library {
operator_type @_1 [latency<1>, incDelay<0.0>, outDelay<0.0>]
operator_type @_10 [latency<10>, incDelay<0.0>, outDelay<0.0>]
}
graph {
%0 = operation<@_1>() [t<0>, z<0.000000e+00 : f32>]
%1 = operation<@_10>(%0) [t<1>, z<0.000000e+00 : f32>]
%2 = operation<@_1>(%0, @op6 [dist<2>]) [t<10>, z<0.000000e+00 : f32>]
%3 = operation<@_1>(%1, %2) [t<11>, z<0.000000e+00 : f32>]
%4 = operation<@_10>(%3) [t<12>, z<0.000000e+00 : f32>]
%5 = operation<@_1>(%3, @op9 [dist<2>]) [t<16>, z<0.000000e+00 : f32>]
%6 = operation<@_1> @op6(%5) [t<17>, z<0.000000e+00 : f32>]
%7 = operation<@_1>(%4, %6) [t<22>, z<0.000000e+00 : f32>]
%8 = operation<@_10>(%7) [t<23>, z<0.000000e+00 : f32>]
%9 = operation<@_1> @op9(%7) [t<23>, z<0.000000e+00 : f32>]
operation<@_1> @last(%8, %9) [t<33>, z<0.000000e+00 : f32>]
// SIMPLEX: @last(%{{.*}}, %{{.*}}) [t<33>, z<0.000000e+00 : f32>]
}
}
// CHECK-LABEL: self_arc
// SIMPLEX-SAME: [II<3>]
ssp.instance @self_arc of "ChainingCyclicProblem" [II<3>] {
library {
operator_type @_1 [latency<1>, incDelay<0.0>, outDelay<0.0>]
operator_type @_3 [latency<3>, incDelay<0.0>, outDelay<0.0>]
}
graph {
%0 = operation<@_1>() [t<0>, z<0.000000e+00 : f32>]
%1 = operation<@_3> @op1(%0, @op1 [dist<1>]) [t<1>, z<0.000000e+00 : f32>]
%2 = operation<@_1> @last(%1) [t<4>, z<0.000000e+00 : f32>]
// SIMPLEX: operation<@_1> @last(%{{.*}}) [t<4>, z<0.000000e+00 : f32>]
}
}
// test from chaining-problems.mlir
// CHECK-LABEL: adder_chain
ssp.instance @adder_chain of "ChainingCyclicProblem" [II<1>] {
library {
operator_type @_0 [latency<0>, incDelay<2.34>, outDelay<2.34>]
operator_type @_1 [latency<1>, incDelay<0.0>, outDelay<0.0>]
}
graph {
%0 = operation<@_0>() [t<0>, z<0.0>]
%1 = operation<@_0>(%0) [t<0>, z<2.34>]
%2 = operation<@_0>(%1) [t<0>, z<4.68>]
%3 = operation<@_0>(%2) [t<1>, z<0.0>]
%4 = operation<@_0>(%3) [t<1>, z<2.34>]
operation<@_1> @last(%4) [t<2>, z<0.0>]
// SIMPLEX: @last(%{{.*}}) [t<2>,
}
}
// CHECK-LABEL: multi_cycle
ssp.instance @multi_cycle of "ChainingCyclicProblem" [II<1>] {
library {
operator_type @_0 [latency<0>, incDelay<2.34>, outDelay<2.34>]
operator_type @_1 [latency<1>, incDelay<0.0>, outDelay<0.0>]
operator_type @_3 [latency<3>, incDelay<2.5>, outDelay<3.75>]
}
graph {
%0 = operation<@_0>() [t<0>, z<0.0>]
%1 = operation<@_0>(%0) [t<0>, z<2.34>]
%2 = operation<@_3>(%1, %0) [t<1>, z<0.00>]
%3 = operation<@_0>(%2, %1) [t<5>, z<0.00>]
%4 = operation<@_0>(%3, %2) [t<5>, z<2.34>]
operation<@_1> @last(%4) [t<5>, z<4.68>]
// SIMPLEX: @last(%{{.*}}) [t<5>,
}
}
// CHECK-LABEL: mco_outgoing_delays
ssp.instance @mco_outgoing_delays of "ChainingCyclicProblem" [II<1>] {
library {
operator_type @_2 [latency<2>, incDelay<0.1>, outDelay<0.1>]
operator_type @_3 [latency<3>, incDelay<5.0>, outDelay<0.1>]
}
// SIMPLEX: graph
graph {
// SIMPLEX-NEXT: [t<0>, z<0.000000e+00 : f32>]
%0 = operation<@_2>() [t<0>, z<0.0>]
// Next op cannot start in cycle 2 due to %0's outgoing delay: 0.1+5.0 > 5.0.
// SIMPLEX-NEXT: [t<3>, z<0.000000e+00 : f32>]
%1 = operation<@_3>(%0) [t<3>, z<0.0>]
// SIMPLEX-NEXT: [t<6>, z<1.000000e-01 : f32>]
%2 = operation<@_2>(%1) [t<6>, z<0.1>]
// Next op should have SITC=0.1 (not: 0.2), because we only consider %2's outgoing delay.
// SIMPLEX-NEXT: [t<8>, z<1.000000e-01 : f32>]
operation<@_2> @last(%2) [t<8>, z<0.1>]
}
}
// custom tests
// CHECK-LABEL: chaining_and_cyclic
// SIMPLEX-SAME: [II<2>]
ssp.instance @chaining_and_cyclic of "ChainingCyclicProblem" [II<2>] {
library {
operator_type @_0 [latency<0>, incDelay<1.0>, outDelay<1.0>]
operator_type @_1 [latency<0>, incDelay<3.0>, outDelay<3.0>]
operator_type @_2 [latency<1>, incDelay<0.0>, outDelay<0.75>]
operator_type @_3 [latency<0>, incDelay<3.5>, outDelay<3.5>]
operator_type @_4 [latency<1>, incDelay<1.2>, outDelay<1.2>]
operator_type @_5 [latency<0>, incDelay<3.8>, outDelay<3.8>]
}
graph {
%0 = operation<@_0>(@op2 [dist<1>]) [t<0>, z<0.0>]
%1 = operation<@_1>(%0) [t<0>, z<1.0>]
%2 = operation<@_2> @op2(%0, @op4 [dist<1>]) [t<0>, z<1.0>]
%3 = operation<@_3>(%0) [t<0>, z<1.0>]
%4 = operation<@_4> @op4(%1, %3) [t<1>, z<0.0>]
// SIMPLEX: @last(%{{.*}}) [t<2>, z<1.200000e+00 : f32>]
%5 = operation<@_5> @last(%4, %2) [t<2>, z<1.2>]
}
}
// CHECK-LABEL: backedge_delay_propagation
// SIMPLEX-SAME: [II<1>]
ssp.instance @backedge_delay_propagation of "ChainingCyclicProblem" [II<1>] {
library {
operator_type @_0 [latency<0>, incDelay<1.0>, outDelay<1.0>]
}
graph {
// SIMPLEX: @last(@last [dist<1>]) [t<0>, z<0.000000e+00 : f32>]
%0 = operation<@_0> @last(@last [dist<1>]) [t<0>, z<0.000000e+00 : f32>]
}
}