Commit Graph

1138 Commits

Author SHA1 Message Date
Nicolas Vasilache a89d8c0a1a Port Tablegen'd reference implementation of Add to declarative builders.
PiperOrigin-RevId: 238977252
2019-03-29 17:22:36 -07:00
Nicolas Vasilache 3a12bc5041 Remove LOAD/STORE/RETURN boilerplate in declarative builders.
This CL introduces a ValueArrayHandle helper to manage the implicit conversion
of ArrayRef<ValueHandle> -> ArrayRef<Value*> by converting first to ValueArrayHandle.
Without this, boilerplate operations that take ArrayRef<Value*> cannot be removed easily.

This all seems to boil down to decoupling Value from Type.
Alternative solutions exist (e.g. MLIR using Value by value everywhere) but they would be very intrusive. This seems to be the lowest impedance change.

Intrinsics are also lowercased by popular demand.

PiperOrigin-RevId: 238974125
2019-03-29 17:22:20 -07:00
Nicolas Vasilache f43388e4ce Port LowerVectorTransfers from EDSC + AST to declarative builders
This CL removes the dependency of LowerVectorTransfers on the AST version of EDSCs which will be retired.

This exhibited a pretty fundamental staging difference in AST-based vs declarative based emission.

Since the delayed creation with an AST was staged, the loop order came into existence after the clipping expressions were computed.
This now changes as the loops first need to be created declaratively in fixed order and then the clipping expressions are created.
Also, due to lack of staging, coalescing cannot be done on the fly anymore and
needs to be done either as a pre-pass (current implementation) or as a local transformation on the generated IR (future work).

Tests are updated accordingly.

PiperOrigin-RevId: 238971631
2019-03-29 17:22:06 -07:00
River Riddle 6810c8bdc1 Moving the IR printing and execution timing options out of mlir-opt and into lib/Pass. We now expose two methods: registerPassManagerCLOptions and applyPassManagerCLOptions; to allow for multiple different users (mlir-opt, etc.) to opt-in to this common functionality.
PiperOrigin-RevId: 238836911
2019-03-29 17:21:50 -07:00
River Riddle 8e7b683d1f Replace the usages of llvm::Timer in PassTiming in favor of a simple nested Timer. The output view is simplified to just display the Wall Time. This new infrastructure will greatly simplify the amount of work needed to support multi-threaded execution timing.
PiperOrigin-RevId: 238819218
2019-03-29 17:21:34 -07:00
Jacques Pienaar cc5657343d Move getSuccessorOperandIndex out of line.
PiperOrigin-RevId: 238814769
2019-03-29 17:21:19 -07:00
Jacques Pienaar 173457cbea Add numeric include for using std::accumulate
PiperOrigin-RevId: 238664219
2019-03-29 17:21:03 -07:00
River Riddle 27d1bb920e Cache the simplified attributes in SimplifyAffineStructures to avoid redundant simplifications, as well as unnecessary accesses to the MLIRContext.
PiperOrigin-RevId: 238654325
2019-03-29 17:20:46 -07:00
Jacques Pienaar cdd56eb675 Qualify DenseMap in AnalysisManager.
PiperOrigin-RevId: 238649794
2019-03-29 17:20:29 -07:00
Jacques Pienaar 14489b5a8a Remove unnecessary headers from mlir-opt.
PiperOrigin-RevId: 238639013
2019-03-29 17:20:12 -07:00
River Riddle 076a7350e2 Add an instrumentation for conditionally printing the IR before and after pass execution. This instrumentation can be added directly to the PassManager via 'enableIRPrinting'. mlir-opt exposes access to this instrumentation via the following flags:
* print-ir-before=(comma-separated-pass-list)
  - Print the IR before each of the passes provided within the pass list.
* print-ir-before-all
  - Print the IR before every pass in the pipeline.
* print-ir-after=(comma-separated-pass-list)
  - Print the IR after each of the passes provided within the pass list.
* print-ir-after-all
  - Print the IR after every pass in the pipeline.
* print-ir-module-scope
  - Always print the Module IR, even for non module passes.

PiperOrigin-RevId: 238523649
2019-03-29 17:19:57 -07:00
River Riddle 087e599a3f Rename allocator to identifierAllocator and add an identifierMutex to make identifier uniquing thread safe. This also adds a general purpose 'contextMutex' to protect access to the rest of the miscellaneous parts of the MLIRContext, e.g. diagnostics, dialect registration, etc. This is step 5/5 of making the MLIRContext thread-safe.
PiperOrigin-RevId: 238516697
2019-03-29 17:19:42 -07:00
River Riddle c769f6b985 Give the Location classes their own SmartRWMutex and make sure that they are using the locationAllocator. This is step 4/N to making MLIRContext thread-safe.
PiperOrigin-RevId: 238516646
2019-03-29 17:19:27 -07:00
River Riddle fd6c94dc8f Give the affine structures, AffineMap/AffineExpr/IntegerSet/etc, their own BumpPtrAllocator and SmartMutex to make them thread-safe. This is step 3/N to making MLIRContext thread-safe.
PiperOrigin-RevId: 238516596
2019-03-29 17:19:11 -07:00
River Riddle 92a8a7115b Give Attributes their own BumpPtrAllocator and SmartRWMutex to make them thread-safe. This is step 2/N to making the MLIRContext thread-safe.
PiperOrigin-RevId: 238516542
2019-03-29 17:18:55 -07:00
River Riddle 9942d41e3b Add an 'Instruction::create' overload that accepts an existing NamedAttributeList. This avoids the need to unique an attribute list if one already exists, e.g. when cloning an existing instruction.
PiperOrigin-RevId: 238512499
2019-03-29 17:18:38 -07:00
River Riddle e472f5b3d9 Optimize the implementation of AffineExprConstantFolder to avoid the redundant creation of IntegerAttrs and IndexType. This becomes a much bigger performance issue when MLIRContext is thread-safe; as each unnecessary call may need to lock a mutex.
PiperOrigin-RevId: 238484632
2019-03-29 17:18:22 -07:00
Alex Zinenko 276fae1b0d Rename BlockList into Region
NFC.  This is step 1/n to specifying regions as parts of any operation.

PiperOrigin-RevId: 238472370
2019-03-29 17:18:04 -07:00
Alex Zinenko 80e38b6204 Python bindings: expose boolean and comparison operators
In particular, expose comparison operators as Python operator overloads on
ValueHandles.  The comparison currently emits signed integer comparisons only,
which is compatible with the behavior of emitter-based EDSC interface.  This is
sub-optimal and must be reconsidered in the future.  Note that comparison
operators are not overloaded in the C++ declarative builder API precisely
because this avoids the premature decision on the signedness of comparisons.

Implement the declarative construction of boolean expressions using
ValueHandles by overloading the boolean operators in the `op` namespace to
differentiate between `operator!` for nullity check and for boolean negation.
The operands must be of i1 type.  Also expose boolean operations as Python
operator overloads on ValueHandles.

PiperOrigin-RevId: 238421615
2019-03-29 17:17:47 -07:00
Alex Zinenko e904ddf315 Python bindings: expose various Ops through declarative builders
In particular, expose `cond_br`, `select` and `call` operations with syntax
similar to that of the previous emitter-based EDSC interface.  These are
provided for backwards-compatibility.  Ideally, we want them to be
Table-generated from the Op definitions when those definitions are declarative.

Additionally, expose the ability to construct any op given its canonical name,
which also exercises the construction of unregistered ops.

PiperOrigin-RevId: 238421583
2019-03-29 17:17:27 -07:00
Alex Zinenko 269d9bf54e Python bindings: expose IndexedValue
Expose edsc::IndexedValue using a syntax smilar to that of edsc::Indexed to
ensure backwards-compatibility.  It remains possible to write array-indexed
loads and stores as

    C.store([i, j], A.load([i, k]) * B.load([k, j]))

after taking a "view" of some value handle using IndexedValue as

    A = IndexedValue(originalValueHandle)

provided that all indices are also value handles.

PiperOrigin-RevId: 238421544
2019-03-29 17:17:12 -07:00
Alex Zinenko 48d0d1f172 Python bindings: use MLIR operations to define constant values
In the original implementation, constants could be bound to EDSC expressions in
the binder, independently from other MLIR Values.  A rework of EDSC including
early typing provided the functionality to use MLIR's `constant` operation to
define typed constants instead of binding them separately, but only used it for
index types.  The new declarative builder implementation followed by providing
a call for building `constant` operations of index types but nothing more.
Expose similar builders for integers, floats and functions to match the what
binders allow one to use.

PiperOrigin-RevId: 238421508
2019-03-29 17:16:57 -07:00
Alex Zinenko d940c52183 Python bindings: make FunctionContext behave more like BlockContext
Provide a function `arg` that returns the function argument as a value handle,
similar to block arguments.  This makes function context managers in Python
similar to block context managers, which is more consistent given that the
function context manager sets the insertion point to the first block of the
function and that arguments of that block are those of the function.  This
prepares the removal of PythonMLIREmitter class and its bind_function_arguments
helper.

Additionally, provide a helper method in PythonMLIRModule to define a function
and immediately create a context for it.  Update the tests that are already
using context managers to use the function context manager instead of creating
the function manually.

PiperOrigin-RevId: 238421087
2019-03-29 17:16:42 -07:00
Uday Bondhugula e1e455f7dd Change parallelism detection test pass to emit a note
- emit a note on the loop being parallel instead of setting a loop attribute
- rename the pass -test-detect-parallel (from -detect-parallel)

PiperOrigin-RevId: 238122847
2019-03-29 17:16:27 -07:00
Uday Bondhugula a228b7d477 Change getMemoryFootprintBytes emitError to a warning
- this is really not a hard error; emit a warning instead (for inability to compute
  footprint due to the union failing due to unimplemented cases)
- remove a misleading warning from LoopFusion.cpp

PiperOrigin-RevId: 238118711
2019-03-29 17:16:12 -07:00
Feng Liu c52a812700 [TableGen] Support nested dag attributes arguments in the result pattern
Add support to create a new attribute from multiple attributes. It extended the
DagNode class to represent attribute creation dag. It also changed the
RewriterGen::emitOpCreate method to support this nested dag emit.

An unit test is added.

PiperOrigin-RevId: 238090229
2019-03-29 17:15:57 -07:00
River Riddle 6558f80c8d Refactor pass timing so that it is toggled on the passmanager via 'enableTiming'. This also makes the pipeline view the default display mode.
PiperOrigin-RevId: 238079916
2019-03-29 17:15:42 -07:00
Uday Bondhugula 9f2781e8dd Fix misc bugs / TODOs / other improvements to analysis utils
- fix for getConstantBoundOnDimSize: floordiv -> ceildiv for extent
- make getConstantBoundOnDimSize also return the identifier upper bound
- fix unionBoundingBox to correctly use the divisor and upper bound identified by
  getConstantBoundOnDimSize
- deal with loop step correctly in addAffineForOpDomain (covers most cases now)
- fully compose bound map / operands and simplify/canonicalize before adding
  dim/symbol to FlatAffineConstraints; fixes false positives in -memref-bound-check; add
  test case there
- expose mlir::isTopLevelSymbol from AffineOps

PiperOrigin-RevId: 238050395
2019-03-29 17:15:27 -07:00
River Riddle 7eee76b84c Give the TypeUniquer its own BumpPtrAllocator and a SmartRWMutex to make it thread-safe. This is step 1/N to making the MLIRContext thread-safe.
PiperOrigin-RevId: 238037814
2019-03-29 17:15:11 -07:00
River Riddle 739f3ef7ee NFC: Remove a stray print in mlir::buildTripCountMapAndOperands.
PiperOrigin-RevId: 238033349
2019-03-29 17:14:56 -07:00
Lei Zhang 372a3a52b5 [TableGen] Sort OpBase.td attribute kinds and refine some comments
This CL sorts attribute kinds in OpBase.td according to a logical order: simple
cases ahead of complicated ones. The logic of attribute kinds involved are
completely untouched.

Comments on AttrConstraint and Attr are revised slightly.

PiperOrigin-RevId: 238031275
2019-03-29 17:14:41 -07:00
Lei Zhang f0998d589b [TableGen] Add common I<n>Tensor, F<n>Tensor, and I64Attr definitions
This CL also changes IntegerAttrBase to use APInt as return value to defer bitwidth
handling to API call sites and be consistent with FloatAttrBase. Call sites are
adjusted accordingly.

PiperOrigin-RevId: 238030614
2019-03-29 17:14:27 -07:00
Uday Bondhugula 075090f891 Extend loop unrolling and unroll-jamming to non-matching bound operands and
multi-result upper bounds, complete TODOs, fix/improve test cases.

- complete TODOs for loop unroll/unroll-and-jam. Something as simple as
  "for %i = 0 to %N" wasn't being unrolled earlier (unless it had been written
  as "for %i = ()[s0] -> (0)()[%N] to %N"; addressed now.

- update/replace getTripCountExpr with buildTripCountMapAndOperands; makes it
  more powerful as it composes inputs into it

- getCleanupLowerBound and getUnrolledLoopUpperBound actually needed the same
  code; refactor and remove one.

- reorganize test cases, write previous ones better; most of these changes are
  "label replacements".

- fix wrongly labeled test cases in unroll-jam.mlir

PiperOrigin-RevId: 238014653
2019-03-29 17:14:12 -07:00
Alex Zinenko 9abea4a466 Python bindings: provide context managers for the Blocks
Expose EDSC block builders as Python context managers, similarly to loop
builders.  Note that blocks, unlike loops, are addressable and may need to be
"declared" without necessarily filling their bodies with instructions.  This is
the case, for example, when branching to a new block from the existing block.
Therefore, creating the block context manager immediately creates the block
(unless the manager captures an existing block) by creating and destroying the
block builder.  With this approach, one can either fill in the block and refer
to it later leveraging Python's dynamic variable lookup

    with BlockContext([indexType]) as b:
      op(...)  # operation inside the block
      ret()
    op(...)  # operation outside the block (in the function entry block)
    br(b, [...])    # branching to the block created above

or declare the block contexts upfront and enter them on demand

    bb1 = BlockContext()  # empty block created in the surrounding function
    bb2 = BlockContext()  # context
    cond_br(bb1.handle, [], bb2.handle, [])  # branch to blocks from here
    with bb1:
      op(...)  # operation inside the first block
    with bb2:
      op(...)  # operation inside the second block
    with bb1:
      op(...)  # append operation to the first block

Additionally, one can create multiple throw-away contexts that append to the
same block

    with BlockContext() as b:
      op(...)  # operation inside the block
    with BlockContext(appendTo(b)):
      op(...)  # new context appends to the block

which has a potential of being extended to control the insertion point of the
block at a finer level of granularity.

PiperOrigin-RevId: 238005298
2019-03-29 17:13:57 -07:00
Alex Zinenko b0cc81883c Python bindings: drop third_party/ in includes
Historically, Python bindings were using full path including third_party for
most headers but not all of them.  This is inconsistent with the rest of MLIR.
Drop the prefix path in #include directives.

PiperOrigin-RevId: 237999346
2019-03-29 17:13:42 -07:00
MLIR Team 8d62a6092f Clean up some stray mlfunc/cfgfunc leftovers.
PiperOrigin-RevId: 237936610
2019-03-29 17:13:26 -07:00
River Riddle fde5bcdae7 Add documentation for the pass instrumentation framework to the WritingAPass document.
PiperOrigin-RevId: 237919897
2019-03-29 17:13:11 -07:00
River Riddle 59b0839206 NFC: Remove old comment referencing CFG/EXT/ML functions.
PiperOrigin-RevId: 237902039
2019-03-29 17:12:56 -07:00
Nicolas Vasilache dfd904d4a9 Minor changes to the EDSC API NFC
This CL makes some minor changes to the declarative builder Helpers:
1. adds lb, ub, step methods to MemRefView to avoid always having to go through std::get + range;
2. drops MemRefView& from IndexedValue which was just creating ownership concerns. Instead, an IndexedValue only needs to keep track of the ValueHandle from which a MemRefView can be constructed on-demand if necessary.

PiperOrigin-RevId: 237861493
2019-03-29 17:12:41 -07:00
Lei Zhang e1595df1af Allow input and output to have different element types for broadcastable ops
TensorFlow comparison ops like tf.Less supports broadcast behavior but the result
type have different element types as the input types. Extend broadcastable trait
to allow such cases. Added tf.Less to demonstrate it.

PiperOrigin-RevId: 237846127
2019-03-29 17:12:26 -07:00
Lei Zhang 7972dcef84 Pull shape broadcast out as a stand-alone utility function
So that we can use this function to deduce broadcasted shapes elsewhere.

Also added support for unknown dimensions, by following TensorFlow behavior.

PiperOrigin-RevId: 237846065
2019-03-29 17:12:11 -07:00
River Riddle 0cc212f2b7 Ensure that pass timing is the last added pass instrumentation. This also updates the PassInstrumentor to iterate in reverse for the "after" hooks. This ensures that the instrumentations run in a stack like fashion.
PiperOrigin-RevId: 237840808
2019-03-29 17:11:56 -07:00
River Riddle dc141c307b Tidy up some of the pass infrastructure g3doc.
* Separate MyAnalysis into MyFunctionAnalysis/MyModuleAnalysis to avoid potential confusion.
* Add an example of an inline lambda builder for PassPipelineRegistration.
* Clarify the wording on a few of the pass restrictions.

PiperOrigin-RevId: 237840325
2019-03-29 17:11:41 -07:00
River Riddle e46ba31c66 Add a new instrumentation for timing pass and analysis execution. This is made available in mlir-opt via the 'pass-timing' and 'pass-timing-display' flags. The 'pass-timing-display' flag toggles between the different available display modes for the timing results. The current display modes are 'list' and 'pipeline', with 'list' representing the default.
Below shows the output for an example mlir-opt command line.

mlir-opt foo.mlir -verify-each=false -cse -canonicalize -cse -cse -pass-timing

list view (-pass-timing-display=list):
* In this mode the results are displayed in a list sorted by total time; with each pass/analysis instance aggregated into one unique result. This mode is similar to the output of 'time-passes' in llvm-opt.

===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0097 seconds (0.0096 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0051 ( 58.3%)   0.0001 ( 12.2%)   0.0052 ( 53.8%)   0.0052 ( 53.8%)  Canonicalizer
   0.0025 ( 29.1%)   0.0005 ( 58.2%)   0.0031 ( 31.9%)   0.0031 ( 32.0%)  CSE
   0.0011 ( 12.6%)   0.0003 ( 29.7%)   0.0014 ( 14.3%)   0.0014 ( 14.2%)  DominanceInfo
   0.0087 (100.0%)   0.0009 (100.0%)   0.0097 (100.0%)   0.0096 (100.0%)  Total

pipeline view (-pass-timing-display=pipeline):
* In this mode the results are displayed in a nested pipeline view that mirrors the internal pass pipeline that is being executed in the pass manager. This view is useful for understanding specifically which parts of the pipeline are taking the most time, and can also be used to identify when analyses are being invalidated and recomputed.

===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0082 seconds (0.0081 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0042 (100.0%)   0.0039 (100.0%)   0.0082 (100.0%)   0.0081 (100.0%)  Function Pipeline
   0.0005 ( 11.6%)   0.0008 ( 21.1%)   0.0013 ( 16.1%)   0.0013 ( 16.2%)    CSE
   0.0002 (  5.0%)   0.0004 (  9.3%)   0.0006 (  7.0%)   0.0006 (  7.0%)      (A) DominanceInfo
   0.0026 ( 61.8%)   0.0018 ( 45.6%)   0.0044 ( 54.0%)   0.0044 ( 54.1%)    Canonicalizer
   0.0005 ( 11.7%)   0.0005 ( 13.0%)   0.0010 ( 12.3%)   0.0010 ( 12.4%)    CSE
   0.0003 (  6.1%)   0.0003 (  8.3%)   0.0006 (  7.2%)   0.0006 (  7.1%)      (A) DominanceInfo
   0.0002 (  3.8%)   0.0001 (  2.8%)   0.0003 (  3.3%)   0.0003 (  3.3%)    CSE
   0.0042 (100.0%)   0.0039 (100.0%)   0.0082 (100.0%)   0.0081 (100.0%)  Total

PiperOrigin-RevId: 237825367
2019-03-29 17:11:25 -07:00
Mehdi Amini 732160eaa5 Move `createConvertToLLVMIRPass()` to its own header matching the target library clients need to link
PiperOrigin-RevId: 237723197
2019-03-29 17:11:07 -07:00
River Riddle 5e1f1d2cab Update the constantFold/fold API to use LogicalResult instead of bool.
PiperOrigin-RevId: 237719658
2019-03-29 17:10:50 -07:00
River Riddle 43d0ca8419 NFC: Move the PassExecutor and PassAdaptor classes into PassDetail.h so that they can be referenced throughout lib/Pass.
PiperOrigin-RevId: 237712736
2019-03-29 17:10:36 -07:00
River Riddle 0310d49f46 Move the success/failure functions out of LogicalResult and into the mlir namespace.
PiperOrigin-RevId: 237712180
2019-03-29 17:10:21 -07:00
River Riddle 2d2b40bce5 Add basic infrastructure for instrumenting pass execution and analysis computation. A virtual class, PassInstrumentation, is provided to allow for different parts of the pass manager infrastructure. The currently available hooks allow for instrumenting:
* before/after pass execution
* after a pass fails
* before/after an analysis is computed

After getting this infrastructure in place, we can start providing common developer utilities like pass timing, IR printing after pass execution, etc.

PiperOrigin-RevId: 237709692
2019-03-29 17:10:06 -07:00
Nicolas Vasilache 861eb87471 [EDSC] Cleanup declarative builder insertion point with blocks
Declarative builders want to provide the same nesting interface for blocks and loops. MLIR on the other hand has different behaviors:
1. when an AffineForOp is created the insertion point does not enter the loop body;
2. when a Block is created, the insertion point does enter the block body.

Guard against the second behavior in EDSC to make the interface unsurprising.
This also surfaces two places in the eager branch API where I was guarding against this behavior indirectly by creating a new ScopedContext.

Instead, uniformize everything to properly reset the insertion point in the unique place that builds the mlir::Block*.

PiperOrigin-RevId: 237619513
2019-03-29 17:09:51 -07:00