Commit Graph

61 Commits

Author SHA1 Message Date
Alexander Belyaev e50261657f Fix 'the the' typo.
PiperOrigin-RevId: 281501234
2019-11-20 05:38:14 -08:00
Andy Davis 68a8da4a93 Fix Affine Loop Fusion test case reported on github.
This CL utilizies the more robust fusion feasibility analysis being built out in LoopFusionUtils, which will eventually be used to replace the current affine loop fusion pass.

PiperOrigin-RevId: 281112340
2019-11-18 11:20:37 -08:00
Kazuaki Ishizaki f28c5aca17 Fix minor spelling tweaks (NFC)
Closes tensorflow/mlir#175

PiperOrigin-RevId: 275726876
2019-10-20 09:44:36 -07:00
Diego Caballero 3451055614 Add support for some multi-store cases in affine fusion
This PR is a stepping stone towards supporting generic multi-store
source loop nests in affine loop fusion. It extends the algorithm to
support fusion of multi-store loop nests that:
 1. have only one store that writes to a function-local live out, and
 2. the remaining stores are involved in loop nest self dependences
    or no dependences within the function.

Closes tensorflow/mlir#162

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/162 from dcaballe:dcaballe/multi-output-fusion 7fb7dec6fe8b45f5ce176f018bfe37b256420c45
PiperOrigin-RevId: 273773907
2019-10-09 10:37:30 -07:00
Uday Bondhugula 4bb6f8ecdb Extend map canonicalization to propagate constant operands
- extend canonicalizeMapAndOperands to propagate constant operands into
  the map's expressions (and thus drop those operands).
- canonicalizeMapAndOperands previously only dropped duplicate and
  unused operands; however, operands that were constants were
  retained.

This change makes IR maps/expressions generated by various
utilities/passes even simpler; also makes some of the test checks more
accurate and simpler -- for eg., 0' instead of symbol(%{{.*}}).

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#107

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/107 from bondhugula:canonicalize-maps c889a51486d14fbf7db489f224f881e7e1ff7d72
PiperOrigin-RevId: 266085289
2019-08-29 01:13:29 -07:00
Alex Zinenko 480d68f8de Affine loop parallelism detection: conservatively handle unknown ops
The loop parallelism detection utility only collects the affine.load and
affine.store operations appearing inside the loop to analyze the access
patterns for the absence of dependences.  However, any operation, including
unregistered operations, can appear in a body of an affine loop.  If such
operation has side effects, the result of parallelism analysis is incorrect.
Conservatively assume affine loops are not parallel in presence of operations
other than affine.load, affine.store, affine.for, affine.terminator that may
have side effects.

This required to update the loop-fusion unit test that relies on parallelism
analysis and was exercising loop fusion in presence of an unregistered
operation.

PiperOrigin-RevId: 259560935
2019-07-23 10:18:46 -07:00
River Riddle 89bc449cee Standardize the value numbering in the AsmPrinter.
Change the AsmPrinter to number values breadth-first so that values in adjacent regions can have the same name. This allows for ModuleOp to contain operations that produce results. This also standardizes the special name of region entry arguments to "arg[0-9+]" now that Functions are also operations.

PiperOrigin-RevId: 257225069
2019-07-09 10:41:00 -07:00
Andy Davis 2e1187dd25 Globally change load/store/dma_start/dma_wait operations over to affine.load/store/dma_start/dma_wait.
In most places, this is just a name change (with the exception of affine.dma_start swapping the operand positions of its tag memref and num_elements operands).
Significant code changes occur here:
*) Vectorization: LoopAnalysis.cpp, Vectorize.cpp
*) Affine Transforms: Transforms/Utils/Utils.cpp

PiperOrigin-RevId: 256395088
2019-07-03 14:37:06 -07:00
River Riddle 679a3b4191 Change the attribute dictionary syntax to separate name and value with '='.
The current syntax separates the name and value with ':', but ':' is already overloaded by several other things(e.g. trailing types). This makes the syntax difficult to parse in some situtations:

Old:
  "foo: 10 : i32"

New:
  "foo = 10 : i32"
PiperOrigin-RevId: 255097928
2019-06-25 19:06:34 -07:00
Geoffrey Martin-Noble fd99b6ce97 Remove unnecessary -verify-diagnostics
These were likely added in error because of confusion about the flag when it was just called "-verify". The extra flag doesn't cause much harm, but it does make mlir-opt do more work and clutter the RUN line

PiperOrigin-RevId: 254037016
2019-06-19 23:08:13 -07:00
Geoffrey Martin-Noble d7d69569e7 Rename -verify mlir-opt flag to -verify-expected-diagnostics
This name has caused some confusion because it suggests that it's running op verification (and that this verification isn't getting run by default).

PiperOrigin-RevId: 254035268
2019-06-19 23:08:03 -07:00
Nicolas Vasilache 258e8d9ce2 Prepend an "affine-" prefix to Affine pass option names - NFC
Trying to activate both LLVM and MLIR passes in mlir-cpu-runner showed name collisions when registering pass names.
    One possible way of disambiguating that should also work across dialects is to prepend the dialect name to the passes that specifically operate on that dialect.

    With this CL, mlir-cpu-runner tests still run when both LLVM and MLIR passes are registered

--

PiperOrigin-RevId: 246539917
2019-05-06 08:26:44 -07:00
River Riddle 6fa3181329 Remove the non-postorder walk functions from Function/Block/Instruction and rename walkPostOrder to walk.
--

PiperOrigin-RevId: 241965239
2019-04-05 07:41:23 -07:00
Andy Davis 7c1fc9e795 Enable producer-consumer fusion for liveout memrefs if consumer read region matches producer write region.
--

PiperOrigin-RevId: 241517207
2019-04-02 13:39:50 -07:00
MLIR Team 9d30b36aaf Enable input-reuse fusion to search function arguments for fusion candidates (takes care of a TODO, enables another tutorial test case).
PiperOrigin-RevId: 240979894
2019-03-29 17:54:36 -07:00
MLIR Team 9d9675fc8f Remove overly conservative check in LoopFusion pass (enables fusion in tutorial example).
PiperOrigin-RevId: 240859227
2019-03-29 17:51:16 -07:00
River Riddle 832567b379 NFC: Rename the 'for' operation in the AffineOps dialect to 'affine.for' and set the namespace of the AffineOps dialect to 'affine'.
PiperOrigin-RevId: 240165792
2019-03-29 17:39:03 -07:00
River Riddle 9c6e92360c NFC: Rename the 'if' operation in the AffineOps dialect to 'affine.if'.
PiperOrigin-RevId: 240071154
2019-03-29 17:36:53 -07:00
MLIR Team c1ff9e866e Use FlatAffineConstraints::unionBoundingBox to perform slice bounds union for loop fusion pass (WIP).
Adds utility to convert slice bounds to a FlatAffineConstraints representation.
Adds utility to FlatAffineConstraints to promote loop IV symbol identifiers to dim identifiers.

PiperOrigin-RevId: 236973261
2019-03-29 16:59:21 -07:00
Uday Bondhugula 7e288e7c19 Add missing run command to fusion test cases - follow up to cl/236882988
PiperOrigin-RevId: 236947383
2019-03-29 16:58:50 -07:00
Uday Bondhugula b34f8d3c83 Fix and improve detectAsMod
- fix for the mod detection
- simplify/avoid the mod at construction (if the dividend is already known to be less
  than the divisor), since the information is available at hand there

PiperOrigin-RevId: 236882988
2019-03-29 16:57:36 -07:00
Uday Bondhugula a77734e185 Make sure that fusion test cases don't have out of bounds accesses
- fix out of bounds test case
- -memref-bound-check on the test/Transforms/loop-fusion.mlir no longer reports any
  errors, before or after -loop-fusion is run

PiperOrigin-RevId: 236757658
2019-03-29 16:56:35 -07:00
MLIR Team 39a1ddeb1c Adds loop attribute as a temporary work around to prevent slice fusion of loop nests containing instructions with side effects (the proper solution will be do use memref read/write regions in the future).
PiperOrigin-RevId: 236733739
2019-03-29 16:56:20 -07:00
Uday Bondhugula 12b9dece8d Bug fix for getConstantBoundOnDimSize
- this was detected when memref-bound-check was run on the output of the
  loop-fusion pass
- the addition (to represent ceildiv as a floordiv) had to be performed only
  for the constant term of the constraint
- update test cases
- memref-bound-check no longer returns an error on the output of this test case

PiperOrigin-RevId: 236731137
2019-03-29 16:56:06 -07:00
MLIR Team d038e34735 Loop fusion for input reuse.
*) Breaks fusion pass into multiple sub passes over nodes in data dependence graph:
- first pass fuses single-use producers into their unique consumer.
- second pass enables fusing for input-reuse by fusing sibling nodes which read from the same memref, but which do not share dependence edges.
- third pass fuses remaining producers into their consumers (Note that the sibling fusion pass may have transformed a producer with multiple uses into a single-use producer).
*) Fusion for input reuse is enabled by computing a sibling node slice using the load/load accesses to the same memref, and fusion safety is guaranteed by checking that the sibling node memref write region (to a different memref) is preserved.
*) Enables output vector and output matrix computations from KFAC patches-second-moment operation to fuse into a single loop nest and reuse input from the image patches operation.
*) Adds a generic loop utilitiy for finding all sequential loops in a loop nest.
*) Adds and updates unit tests.

PiperOrigin-RevId: 236350987
2019-03-29 16:52:35 -07:00
MLIR Team c2766f3760 Fix bug in memref region computation with slice loop bounds. Adds loop IV values to ComputationSliceState which are used in FlatAffineConstraints::addSliceBounds, to ensure that constraints are only added for loop IV values which are present in the constraint system.
PiperOrigin-RevId: 235952912
2019-03-29 16:47:29 -07:00
Uday Bondhugula a1dad3a5d9 Extend/improve getSliceBounds() / complete TODO + update unionBoundingBox
- compute slices precisely where the destination iteration depends on multiple source
  iterations (instead of over-approximating to the whole source loop extent)
- update unionBoundingBox to deal with input with non-matching symbols
- reenable disabled backend test case

PiperOrigin-RevId: 234714069
2019-03-29 16:33:11 -07:00
MLIR Team 58aa383e60 Support fusing producer loop nests which write to a memref which is live out, provided that the write region of the consumer loop nest to the same memref is a super set of the producer's write region.
PiperOrigin-RevId: 234240958
2019-03-29 16:30:11 -07:00
MLIR Team 8f5f2c765d LoopFusion: perform a series of loop interchanges to increase the loop depth at which slices of producer loop nests can be fused into constumer loop nests.
*) Adds utility to LoopUtils to perform loop interchange of two AffineForOps.
*) Adds utility to LoopUtils to sink a loop to a specified depth within a loop nest, using a series of loop interchanges.
*) Computes dependences between all loads and stores in the loop nest, and classifies each loop as parallel or sequential.
*) Computes loop interchange permutation required to sink sequential loops (and raise parallel loop nests) while preserving relative order among them.
*) Checks each dependence against the permutation to make sure that dependences would not be violated by the loop interchange transformation.
*) Calls loop interchange in LoopFusion pass on consumer loop nests before fusing in producers, sinking loops with loop carried dependences deeper into the consumer loop nest.
*) Adds and updates related unit tests.

PiperOrigin-RevId: 234158370
2019-03-29 16:29:26 -07:00
MLIR Team affb2193cc Update direction vector computation to use FlatAffineConstraints::getLower/UpperBounds.
Update FlatAffineConstraints::getLower/UpperBounds to project to the identifier for which bounds are being computed. This change enables computing bounds on an identifier which were previously dependent on the bounds of another identifier.

PiperOrigin-RevId: 234017514
2019-03-29 16:28:25 -07:00
Uday Bondhugula c419accea3 Automated rollback of changelist 232728977.
PiperOrigin-RevId: 232944889
2019-03-29 16:21:38 -07:00
Uday Bondhugula 4ba8c9147d Automated rollback of changelist 232717775.
PiperOrigin-RevId: 232807986
2019-03-29 16:19:33 -07:00
River Riddle fd2d7c857b Rename the 'if' operation in the AffineOps dialect to 'affine.if' and namespace
the AffineOps dialect with 'affine'.

PiperOrigin-RevId: 232728977
2019-03-29 16:18:59 -07:00
River Riddle 90d10b4e00 NFC: Rename the 'for' operation in the AffineOps dialect to 'affine.for'. The is the second step to adding a namespace to the AffineOps dialect.
PiperOrigin-RevId: 232717775
2019-03-29 16:17:59 -07:00
River Riddle 3227dee15d NFC: Rename affine_apply to affine.apply. This is the first step to adding a namespace to the affine dialect.
PiperOrigin-RevId: 232707862
2019-03-29 16:17:29 -07:00
MLIR Team a78edcda5b Loop fusion improvements:
*) After a private memref buffer is created for a fused loop nest, dependences on the old memref are reduced, which can open up fusion opportunities. In these cases, users of the old memref are added back to the worklist to be reconsidered for fusion.
*) Fixed a bug in fusion insertion point dependence check where the memref being privatized was being skipped from the check.

PiperOrigin-RevId: 232477853
2019-03-29 16:13:50 -07:00
MLIR Team d7c824451f LoopFusion: insert the source loop nest slice at a depth in the destination loop nest which preserves dependences (above any loop carried or other dependences). This is accomplished by updating the maximum destination loop depth based on dependence checks between source loop nest loads and stores which access the memref on which the source loop nest has a store op. In addition, prevent fusing in source loop nests which write to memrefs which escape or are live out.
PiperOrigin-RevId: 231684492
2019-03-29 16:03:23 -07:00
River Riddle a642bb1779 Update tests using affine maps to not rely on specific map numbers in the output IR. This is necessary to remove the dependency on ForInst not numbering the AffineMap bounds it has custom formatting for.
PiperOrigin-RevId: 231634812
2019-03-29 16:03:08 -07:00
MLIR Team a0f3db4024 Support fusing loop nests which require insertion into a new instruction Block position while preserving dependences, opening up additional fusion opportunities.
- Adds SSA Value edges to the data dependence graph used in the loop fusion pass.

PiperOrigin-RevId: 231417649
2019-03-29 16:00:04 -07:00
River Riddle 755538328b Recommit: Define a AffineOps dialect as well as an AffineIfOp operation. Replace all instances of IfInst with AffineIfOp and delete IfInst.
PiperOrigin-RevId: 231342063
2019-03-29 15:59:30 -07:00
Nicolas Vasilache ae772b7965 Automated rollback of changelist 231318632.
PiperOrigin-RevId: 231327161
2019-03-29 15:42:38 -07:00
River Riddle 5ecef2b3f6 Define a AffineOps dialect as well as an AffineIfOp operation. Replace all instances of IfInst with AffineIfOp and delete IfInst.
PiperOrigin-RevId: 231318632
2019-03-29 15:42:08 -07:00
Uday Bondhugula b4a1443508 Update replaceAllMemRefUsesWith to generate single result affine_apply's for
index remapping
- generate a sequence of single result affine_apply's for the index remapping
  (instead of one multi result affine_apply)
- update dma-generate and loop-fusion test cases; while on this, change test cases
  to use single result affine apply ops
- some fusion comment fix/cleanup

PiperOrigin-RevId: 230985830
2019-03-29 15:38:23 -07:00
MLIR Team b28009b681 Fix single producer check in loop fusion pass.
PiperOrigin-RevId: 230565482
2019-03-29 15:32:20 -07:00
Uday Bondhugula 864d9e02a1 Update fusion cost model + some additional infrastructure and debug information for -loop-fusion
- update fusion cost model to fuse while tolerating a certain amount of redundant
  computation; add cl option -fusion-compute-tolerance
  evaluate memory footprint and intermediate memory reduction
- emit debug info from -loop-fusion showing what was fused and why
- introduce function to compute memory footprint for a loop nest
- getMemRefRegion readability update - NFC

PiperOrigin-RevId: 230541857
2019-03-29 15:32:06 -07:00
Uday Bondhugula 94a03f864f Allocate private/local buffers for slices accurately during fusion
- the size of the private memref created for the slice should be based on
  the memref region accessed at the depth at which the slice is being
  materialized, i.e., symbolic in the outer IVs up until that depth, as opposed
  to the region accessed based on the entire domain.

- leads to a significant contraction of the temporary / intermediate memref
  whenever the memref isn't reduced to a single scalar (through store fwd'ing).

Other changes

- update to promoteIfSingleIteration - avoid introducing unnecessary identity
  map affine_apply from IV; makes it much easier to write and read test cases
  and pass output for all passes that use promoteIfSingleIteration; loop-fusion
  test cases become much simpler

- fix replaceAllMemrefUsesWith bug that was exposed by the above update -
  'domInstFilter' could be one of the ops erased due to a memref replacement in
  it.

- fix getConstantBoundOnDimSize bug: a division by the coefficient of the identifier was
  missing (the latter need not always be 1); add lbFloorDivisors output argument

- rename getBoundingConstantSizeAndShape -> getConstantBoundingSizeAndShape

PiperOrigin-RevId: 230405218
2019-03-29 15:30:31 -07:00
MLIR Team 71495d58a7 Handle escaping memrefs in loop fusion pass:
*) Do not remove loop nests which write to memrefs which escape the function.
*) Do not remove memrefs which escape the function (e.g. are used in the return instruction).

PiperOrigin-RevId: 230398630
2019-03-29 15:30:14 -07:00
Uday Bondhugula d7522eb264 Fix test cases that were accessing out of bounds to start with (b/123072438)
- detected with memref-bound-check

- fixes b/123072438; while on this, fix another test case which was reported
  out of bounds

PiperOrigin-RevId: 229978187
2019-03-29 15:27:29 -07:00
MLIR Team c4237ae990 LoopFusion: Creates private MemRefs which are used only by operations in the fused loop.
*) Enables reduction of private memref size based on MemRef region accessed by fused slice.
*) Enables maximal fusion by creating a private memref to break a fusion-preventing dependence.
*) Adds maximal fusion flag to enable fusing as much as possible (though it still fuses the minimum cost computation slice).

PiperOrigin-RevId: 229936698
2019-03-29 15:26:15 -07:00
Uday Bondhugula 40f7535571 Update stale / target-specific information in comments - NFC
PiperOrigin-RevId: 229800834
2019-03-29 15:25:29 -07:00