Commit Graph

736 Commits

Author SHA1 Message Date
Mehdi Amini 7286d43920 Introduce std.varargs attribute to mark variadic arguments functions
This is only teaching the LLVM converter to propagate the attribute onto
    the function type. MLIR will not recognize this arguments, so it would only
    be useful when calling for example `printf` with the same arguments across
    a module. Since varargs is part of the ABI lowering, this is not NFC.

--

PiperOrigin-RevId: 242382427
2019-04-07 18:22:56 -07:00
Mehdi Amini f9c4c60320 Do not mark llvm.return, llvm.br, llvm.cond_br operations as having no side effects.
The canonicalizer is taking advantage of this to kill them.

--

PiperOrigin-RevId: 242374166
2019-04-07 18:22:46 -07:00
Chris Lattner 72441fcbf2 Change the asmprinter to use pretty syntax for dialect types when it can,
making the IR dumps much nicer.

    This is part 2/3 of the path to making dialect types more nice.  Part 3/3 will
    slightly generalize the set of characters allowed in pretty types and make it
    more principled.

--

PiperOrigin-RevId: 242249955
2019-04-07 18:21:13 -07:00
Chris Lattner 3f93d93367 Introduce support for parsing pretty dialect types, currently with a very
restricted grammar.  This will make certain common types much easier to read.

    This is part tensorflow/mlir#1 of 2, which allows us to accept the new syntax.  Part 2 will
    change the asmprinter to automatically use it when appropriate, which will
    require updating a bunch of tests.

    This is motivated by the EuroLLVM tutorial and cleaning up the LLVM dialect aesthetics a bit more.

--

PiperOrigin-RevId: 242234821
2019-04-07 18:21:02 -07:00
Ashwin Murthy fe1211edf2 Add attr constraint support to constrain IntegerArray attribute values at a specified index.
--

PiperOrigin-RevId: 242209394
2019-04-07 18:20:29 -07:00
Lei Zhang 4cda344e7b Add methods for building array attributes in Builder
I32/I64/F32/F64/Str array attributes are commonly used in ops. It helps
    to have handy methods for them.

--

PiperOrigin-RevId: 242170569
2019-04-07 18:19:56 -07:00
Lei Zhang b5235d1a9c [TableGen] Support array attribute subclasses and constraints
To support automatically constraint composition of ArrayAttr, a new
    predicate combiner, Concat, is introduced. It prepends a prefix and
    appends a postfix to a child predicate's final predicate string.

--

PiperOrigin-RevId: 242121186
2019-04-05 07:43:32 -07:00
River Riddle fde21c6faf NFC: Fix a few typos in the tutorials and one in the comment of FunctionAttr::dropFunctionReference.
--

PiperOrigin-RevId: 242050934
2019-04-05 07:43:05 -07:00
Mehdi Amini d33a9dcc73 Add Chapter 4 for the Toy tutorial: shape inference, function specialization, and basic combines
--

PiperOrigin-RevId: 242050514
2019-04-05 07:42:56 -07:00
River Riddle a8f4b9eeeb Iterate on the operations to fold in TestConstantFold in reverse to remove the need for ConstantFoldHelper to have a flag for insertion at the head of the entry block. This also fixes an asan bug in TestConstantFold due to the iteration order of operations and ConstantFoldHelper's constant insertion placement.
Note: This now means that we cannot fold chains of operations, i.e. where constant foldable operations feed into each other. Given that this is a testing pass solely for constant folding, this isn't really something that we want anyways. Constant fold tests should be simple and direct, with more advanced folding/feeding being tested with the canonicalizer.

--

PiperOrigin-RevId: 242011744
2019-04-05 07:41:52 -07:00
Lei Zhang 4e40c83291 Deduplicate constant folding logic in ConstantFold and GreedyPatternRewriteDriver
There are two places containing constant folding logic right now: the ConstantFold
    pass and the GreedyPatternRewriteDriver. The logic was not shared and started to
    drift apart. We were testing constant folding logic using the ConstantFold pass,
    but lagged behind the GreedyPatternRewriteDriver, where we really want the constant
    folding to happen.

    This CL pulled the logic into utility functions and classes for sharing between
    these two places. A new ConstantFoldHelper class is created to help constant fold
    and de-duplication.

    Also, renamed the ConstantFold pass to TestConstantFold to make it clear that it is
    intended for testing purpose.

--

PiperOrigin-RevId: 241971681
2019-04-05 07:41:32 -07:00
River Riddle 6fa3181329 Remove the non-postorder walk functions from Function/Block/Instruction and rename walkPostOrder to walk.
--

PiperOrigin-RevId: 241965239
2019-04-05 07:41:23 -07:00
Andy Davis d0d1b2a30d Fix bug in LoopTiling where creation of tile-space loop upper bound did not handle symbol operands correctly.
--

PiperOrigin-RevId: 241958502
2019-04-05 07:41:12 -07:00
Andy Davis 55014813e3 Adds dependence analysis support for iteration domains with local variables (enables dependence analysis of loops with non-unit step).
--

PiperOrigin-RevId: 241957023
2019-04-05 07:41:02 -07:00
Lei Zhang 76cb205326 [TableGen] Enforce constraints on attributes
Previously, attribute constraints are basically unused: we set true for almost
    anything. This CL refactors common attribute kinds and sets constraints on
    them properly. And fixed verification failures found by this change.

    A noticeable one is that certain TF ops' attributes are required to be 64-bit
    integer, but the corresponding TFLite ops expect 32-bit integer attributes.
    Added bitwidth converters to handle this difference.

--

PiperOrigin-RevId: 241944008
2019-04-05 07:40:50 -07:00
Lei Zhang c7790df2ed [TableGen] Add PatternSymbolResolver for resolving symbols bound in patterns
We can bind symbols to op arguments/results in source pattern and op results in
    result pattern. Previously resolving these symbols is scattered across
    RewriterGen.cpp. This CL aggregated them into a `PatternSymbolResolver` class.

    While we are here, this CL also cleans up tests for patterns to make them more
    focused. Specifically, one-op-one-result.td is superseded by pattern.td;
    pattern-tAttr.td is simplified; pattern-bound-symbol.td is added for the change
    in this CL.

--

PiperOrigin-RevId: 241913973
2019-04-05 07:40:41 -07:00
Lei Zhang 3c833344c8 [TableGen] Rework verifier generation and error messages
Previously we bundle the existence check and the MLIR attribute kind check
    in one call. Further constraints (like element bitwidth) have to be split
    into following checks. That is not a nice separation given that we have more
    checks for constraints. Instead, this CL changes to generate a local variable
    for every attribute, check its existence first, then check the constraints.

    Creating a local variable for each attribute also avoids querying it multiple
    times using the raw getAttr() API. This is a win for both performance the
    readability of the generated code.

    This CL also changed the error message to be more greppable by delimiting
    the error message from constraints with boilerplate part with colon.

--

PiperOrigin-RevId: 241906132
2019-04-05 07:40:31 -07:00
Mehdi Amini 092f3facad Fix Toy Ch3 testing with CMake
Mainly a missing dependency caused the tests to pass if one already built
    the repo, but not from a clean (or incremental) build.

--

PiperOrigin-RevId: 241852313
2019-04-03 19:22:42 -07:00
Mehdi Amini f0a328b6d5 Chapter 3 for Toy tutorial: introduction of a dialect
--

PiperOrigin-RevId: 241849162
2019-04-03 19:22:32 -07:00
Mehdi Amini 3a2955fa1f Rename UnknownType to OpaqueType (NFC)
This came up in a review of the tutorial, it was suggested that "opaque" is more
    descriptive than "unknown" here.

--

PiperOrigin-RevId: 241832927
2019-04-03 19:21:47 -07:00
Stella Laurenzo 288bf2b5b9 Split the Quantization dialect.
- Retains Quantization types and predicates.
    - Retains utilities and example (testable) passes/ops.
    - Retains unit tests for example passes/ops.
    - Moves fixed point ops (and corresponding real ops) to FxpMathOps.
    - Moves real -> fixed point pass to FxpMathOps.
    - Sever the dependency on the TF dialect from Quantization. These dialects should now be open-sourcable.

--

PiperOrigin-RevId: 241825598
2019-04-03 19:21:28 -07:00
Lei Zhang b9e3b2107b [TableGen] Allow additional result patterns not directly used for replacement
This CL looses the requirement that all result patterns in a rewrite rule must
    replace a result of the root op in the source pattern. Now only the last N
    result pattern-generated ops are used to replace a N-result source op.

    This allows to generate additional ops to aid building up final ops used to
    replace the source op.

--

PiperOrigin-RevId: 241783192
2019-04-03 19:20:22 -07:00
Stella Laurenzo 13bb8f491a Initial release of the Quantization dialect
Includes a draft of documentation for the quantization setup.

Given how many comments such docs have garnered in the past, I've biased towards a lightly edited first-draft so that people can argue about terminology, approach and structure without having spent too much time on it.

Note that the sections under "Uniform quantization" were cribbed nearly verbatim from internal documentation that Daniel wrote.

PiperOrigin-RevId: 241768668
2019-04-03 19:20:12 -07:00
Lei Zhang 3522c65d3b [TableGen] Fix convertFromStorage for OptionalAttr
OptionalAttr is just wrapping around the actual attribute; so it should just use
    the actual attribute's `convertFromStorage` to read the value and wrap it around
    with `Optional<>` to return. Previously it was mandating how the actual attribute
    reads the value with `{0}.getValue()`.

--

PiperOrigin-RevId: 241762355
2019-04-03 18:36:19 -07:00
Lei Zhang 1ac49ce0bd [TableGen] Remove asserts for attributes in aggregate builders
Attributes can have default values or be optional. Checking the validity of
    attributes in aggregate builder should consider that. And to be accurate,
    we should check all required attributes are indeed provided in the list.
    This is actually duplicating the work done by verifier. Checking the validity
    of attributes should be the responsiblity of verifiers. This CL removes
    the assertion for attributes in aggregate builders for the above reason.
    (Assertions for operands/results are still kept since they are trivial.)

    Also added more tests for aggregate builders.

--

PiperOrigin-RevId: 241746059
2019-04-03 09:30:49 -07:00
Alex Zinenko f50edc65cd Drop MLIREmitter-based version of the EDSC
This version has been deprecated and can now be removed completely since the
    last remaining user (Python bindings) migrated to declarative builders.
    Several functions in lib/EDSC/Types.cpp construct core IR objects for the C
    bindings.  Move these functions into lib/EDSC/CoreAPIs.cpp until we decide
    where they should live.

    This completes the migration from the delayed-construction EDSC to Declarative
    Builders.

--

PiperOrigin-RevId: 241716729
2019-04-03 08:30:38 -07:00
Nicolas Vasilache f1b12f5a64 Fix test that fails on non-determinism in LowerVectorTransfers
This CL fixes the non-determinism across compilers in an edsc::select expression used in LowerVectorTransfers. This is achieved by factoring the expression out of the function call to ensure a deterministic order of evaluation.
    Since the expression is now factored out, fewer IR is generated and the test is updated accordingly.

--

PiperOrigin-RevId: 241679962
2019-04-03 01:09:13 -07:00
Alex Zinenko 736bef7386 Introduce custom format for the LLVM IR Dialect
Historically, the LLVM IR dialect has been using the generic form of MLIR
    operation syntax.  It is verbose and often redundant.  Introduce the custom
    printing and parsing for all existing operations in the LLVM IR dialect.
    Update the relevant documentation and tests.

--

PiperOrigin-RevId: 241617393
2019-04-02 16:31:58 -07:00
Mehdi Amini 213dda687b Chapter 2 of the Toy tutorial
This introduces a basic MLIRGen through straight AST traversal,
    without dialect registration at this point.

--

PiperOrigin-RevId: 241588354
2019-04-02 13:41:00 -07:00
River Riddle 67a52c44b1 Rewrite the verify hooks on operations to use LogicalResult instead of bool. This also changes the return of Operation::emitError/emitOpError to LogicalResult as well.
--

PiperOrigin-RevId: 241588075
2019-04-02 13:40:47 -07:00
Mehdi Amini 38b71d6b84 Initial version for chapter 1 of the Toy tutorial
--

PiperOrigin-RevId: 241549247
2019-04-02 13:40:06 -07:00
Andy Davis 7c1fc9e795 Enable producer-consumer fusion for liveout memrefs if consumer read region matches producer write region.
--

PiperOrigin-RevId: 241517207
2019-04-02 13:39:50 -07:00
River Riddle 0451403066 Update the pass ir-printing test to not rely on rtti type pretty printing.
--

PiperOrigin-RevId: 241498090
2019-04-02 13:39:31 -07:00
River Riddle 084669e005 Remove MLPatternLoweringPass and rewrite LowerVectorTransfers to use RewritePattern instead.
--

PiperOrigin-RevId: 241455472
2019-04-02 13:39:17 -07:00
Lei Zhang bae95d25e5 [TableGen] Add Confined, IntMinValue, and ArrayMinCount for attribute constraints
This CL introduces Confined as a general mechanism to compose complex attribute
    constraints out of more primitive ones. It's particularly useful for automatically
    generating op definitions from some external source, where we can have random
    combinations of primitive constraints and it would be impractical to define a case
    for each of such combination.

    Two primitive attribute constraints, IntMinValue and ArrayMinCount, are added to be
    used together with Confined.

--

PiperOrigin-RevId: 241435955
2019-04-02 13:39:03 -07:00
Feng Liu 191aaa82ef Support 0-d tensor type attributes
This CL fixes the parser and printer to support the 0-d tensor type attributes.

--

PiperOrigin-RevId: 241345329
2019-04-01 10:59:59 -07:00
Lei Zhang b9e38a7972 [TableGen] Add EnumAttrCase and EnumAttr
This CL adds EnumAttr as a general mechanism for modelling enum attributes. Right now
    it is using StringAttr under the hood since MLIR does not have native support for enum
    attributes.

--

PiperOrigin-RevId: 241334043
2019-04-01 10:59:31 -07:00
River Riddle 082016d43a Add a flag to Dialect that allows for dialects to enable support for unregistered operations. This flag is off by default and can be toggled via the 'allowUnknownOperations(...)' method. This means that moving forward an error will be emitted for unknown operations if the dialect does not explicitly allow it.
Example:

    func @unknown_std_op() {
      %0 = "std.foo_bar_op"() : () -> index
      return
    }

    Will result in:

    error: unregistered operation 'std.foo_bar_op' found in dialect ('std') that does not allow unknown operations

--

PiperOrigin-RevId: 241266009
2019-04-01 10:59:17 -07:00
Chris Lattner 0fb905c070 Implement basic IR support for a builtin complex<> type. As with tuples, we
have no standard ops for working with these yet, this is simply enough to
    represent and round trip them in the printer and parser.

--

PiperOrigin-RevId: 241102728
2019-03-30 11:23:39 -07:00
Jacques Pienaar 1273af232c Add build files and update README.
* Add initial version of build files;
    * Update README with instructions to download and build MLIR from github;

--

PiperOrigin-RevId: 241102092
2019-03-30 11:23:22 -07:00
Nicolas Vasilache c9d5f3418a Cleanup SuperVectorization dialect printing and parsing.
On the read side,
```
%3 = vector_transfer_read %arg0, %i2, %i1, %i0 {permutation_map: (d0, d1, d2)->(d2, d0)} : (memref<?x?x?xf32>, index, index, index) -> vector<32x256xf32>
```

becomes:

```
%3 = vector_transfer_read %arg0[%i2, %i1, %i0] {permutation_map: (d0, d1, d2)->(d2, d0)} : memref<?x?x?xf32>, vector<32x256xf32>
```

On the write side,

```
vector_transfer_write %0, %arg0, %c3, %c3 {permutation_map: (d0, d1)->(d0)} : vector<128xf32>, memref<?x?xf32>, index, index
```

becomes

```
vector_transfer_write %0, %arg0[%c3, %c3] {permutation_map: (d0, d1)->(d0)} : vector<128xf32>, memref<?x?xf32>
```

Documentation will be cleaned up in a followup commit that also extracts a proper .md from the top of the file comments.

PiperOrigin-RevId: 241021879
2019-03-29 17:56:42 -07:00
Feng Liu 5303587448 [TableGen] Support benefit score in pattern definition.
A integer number can be specified in the pattern definition and used as the
adjustment to the default benefit score in the generated rewrite pattern C++
definition.

PiperOrigin-RevId: 240994192
2019-03-29 17:55:55 -07:00
Nicolas Vasilache 094ca64ab0 Refactor vectorization patterns
This CL removes the reliance of the vectorize pass on the specification of a `fastestVaryingDim` parameter. This parameter is a restriction meant to more easily target a particular loop/memref combination for vectorization and is mainly used for testing.

This also had the side-effect of restricting vectorization patterns to only the ones in which all memrefs were contiguous along the same loop dimension. This simple restriction prevented matmul to vectorize in 2-D.

this CL removes the restriction and adds the matmul test which vectorizes in 2-D along the parallel loops. Support for reduction loops is left for future work.

PiperOrigin-RevId: 240993827
2019-03-29 17:55:36 -07:00
MLIR Team 9d30b36aaf Enable input-reuse fusion to search function arguments for fusion candidates (takes care of a TODO, enables another tutorial test case).
PiperOrigin-RevId: 240979894
2019-03-29 17:54:36 -07:00
River Riddle 106dd08e99 Change the vectorizer test pass to output via diagnostics instead of llvm::outs. This allows for the output to be deterministic when multi-threading is enabled.
PiperOrigin-RevId: 240905858
2019-03-29 17:54:21 -07:00
MLIR Team dd0029e4f6 Support for type constraints across operand and results
--

PiperOrigin-RevId: 240905555
2019-03-29 17:54:06 -07:00
River Riddle 76181a7b38 Remove the LowerEDSCTestPass.
Most of the tests have been ported to be unit-tests and this pass is problematic in the way it depends on TableGen-generated files. This pass is also non-deterministic during multi-threading and a blocker to turning it on by default.

PiperOrigin-RevId: 240889154
2019-03-29 17:53:05 -07:00
River Riddle 909a63d8bf Tidy up a few comments and error messages related to parsing multi-result operations.
PiperOrigin-RevId: 240876306
2019-03-29 17:52:51 -07:00
River Riddle 01140bd137 Change the muli-return syntax for operations. The name of the operation result now contains the number of results that it refers to if the number of results is greater than 1.
Example:
    %call:2 = call @multi_return() : () -> (f32, i32)
    use(%calltensorflow/mlir#0, %calltensorflow/mlir#1)

This cl also adds parser support for uniquely named result values. This means that a test writer can now write something like:
    %foo, %bar = call @multi_return() : () -> (f32, i32)
    use(%foo, %bar)

Note: The printer will still print the collapsed form.
PiperOrigin-RevId: 240860058
2019-03-29 17:51:32 -07:00
MLIR Team 9d9675fc8f Remove overly conservative check in LoopFusion pass (enables fusion in tutorial example).
PiperOrigin-RevId: 240859227
2019-03-29 17:51:16 -07:00
Dimitrios Vytiniotis 79bd6badb2 Remove global LLVM CLI variables from library code
Plus move parsing code into the MLIR CPU runner binary.

PiperOrigin-RevId: 240786709
2019-03-29 17:50:23 -07:00
River Riddle af9760fe18 Replace remaining usages of the Instruction class with Operation.
PiperOrigin-RevId: 240777521
2019-03-29 17:50:04 -07:00
Nicolas Vasilache 31442a66ef Cleanup vectorize_1d.mlir test - NFC
This CL splits a large monolithic test function into smaller ones that are each CHECK-LABEL'd

PiperOrigin-RevId: 240684979
2019-03-29 17:49:45 -07:00
Nicolas Vasilache 4dc7af9da8 Make vectorization aware of loop semantics
Now that we have a dependence analysis, we can check that loops are indeed parallel and make vectorization correct.

PiperOrigin-RevId: 240682727
2019-03-29 17:49:30 -07:00
River Riddle 21547ace87 Update the multi-threaded pass timing to not assume that total time will be different from user time.
PiperOrigin-RevId: 240681618
2019-03-29 17:49:14 -07:00
Jacques Pienaar 8f1e744169 Move test of trait using dialect ops, to dialects of ops.
PiperOrigin-RevId: 240680010
2019-03-29 17:48:59 -07:00
Jacques Pienaar d7e386cea9 Move TF dialect test to dialect.
PiperOrigin-RevId: 240646586
2019-03-29 17:48:28 -07:00
Nicolas Vasilache c3742d20b5 Give the Vectorize pass a virtualVectorSize argument.
This CL allows vectorization to be called and configured in other ways than just via command line arguments.
This allows triggering vectorization programmatically.

PiperOrigin-RevId: 240638208
2019-03-29 17:48:12 -07:00
Lei Zhang d5524388ab [TableGen] Change names for Builder* and OperationState* parameters to avoid collision
The `Builder*` parameter is unused in both generated build() methods so that we can
leave it unnamed. Changed stand-alone parameter build() to take `_tblgen_state` instead
of `result` to allow `result` to avoid having name collisions with op operand,
attribute, or result.

PiperOrigin-RevId: 240637700
2019-03-29 17:47:57 -07:00
River Riddle 3a845be7d1 Add support for multi-threaded pass timing.
When multi-threading is enabled in the pass manager the meaning of the display
slightly changes. First, a new timing column is added, `User Time`, that
displays the total time spent across all threads. Secondly, the `Wall Time`
column displays the longest individual time spent amongst all of the threads.
This means that the `Wall Time` column will continue to give an indicator on the
perceived time, or clock time, whereas the `User Time` will display the total
cpu time.

Example:

$ mlir-opt foo.mlir -experimental-mt-pm -cse -canonicalize -convert-to-llvmir -pass-timing

===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0078 seconds

   ---User Time---   ---Wall Time---  --- Name ---
   0.0175 ( 88.3%)     0.0055 ( 70.4%)  Function Pipeline
   0.0018 (  9.3%)     0.0006 (  8.1%)    CSE
   0.0013 (  6.3%)     0.0004 (  5.8%)      (A) DominanceInfo
   0.0017 (  8.7%)     0.0006 (  7.1%)    FunctionVerifier
   0.0128 ( 64.6%)     0.0039 ( 50.5%)    Canonicalizer
   0.0011 (  5.7%)     0.0004 (  4.7%)    FunctionVerifier
   0.0004 (  2.1%)     0.0004 (  5.2%)  ModuleVerifier
   0.0010 (  5.3%)     0.0010 ( 13.4%)  LLVMLowering
   0.0009 (  4.3%)     0.0009 ( 11.0%)  ModuleVerifier
   0.0198 (100.0%)     0.0078 (100.0%)  Total

PiperOrigin-RevId: 240636269
2019-03-29 17:47:41 -07:00
Jacques Pienaar 810e95b861 Use dereference instead of implicit conversion for IndexedValue to Value*.
Avoids ambiguous constructor error on some compilers.

PiperOrigin-RevId: 240606838
2019-03-29 17:45:56 -07:00
Alex Zinenko e2f9079a71 LLVM IR Conversion: support zero-dimensional memrefs
The spec allows zero-dimensional memrefs to exist and treats them essentially
as single-element buffers.  Unlike single-dimensional memrefs of static shape
<1xTy>, zero-dimensional memrefs do not require indices to access the only
element they store.  Add support of zero-dimensional memrefs to the LLVM IR
conversion.  In particular, such memrefs are converted into bare pointers, and
accesses to them are converted to bare loads and stores, without the overhead
of `getelementptr %buffer, 0`.

PiperOrigin-RevId: 240579456
2019-03-29 17:45:26 -07:00
Nicolas Vasilache 04b925f1b8 Port api-test::tile_2d to the edsc::Builder API
The AST-based EDSCs implementation will be retired soon, this test was missing from the builders API.

PiperOrigin-RevId: 240547453
2019-03-29 17:44:40 -07:00
Alex Zinenko 5a5bba0279 Introduce affine terminator
Due to legacy reasons (ML/CFG function separation), regions in affine control
flow operations require contained blocks not to have terminators.  This is
inconsistent with the notion of the block and may complicate code motion
between regions of affine control operations and other regions.

Introduce `affine.terminator`, a special terminator operation that must be used
to terminate blocks inside affine operations and transfers the control back to
he region enclosing the affine operation.  For brevity and readability reasons,
allow `affine.for` and `affine.if` to omit the `affine.terminator` in their
regions when using custom printing and parsing format.  The custom parser
injects the `affine.terminator` if it is missing so as to always have it
present in constructed operations.

Update transformations to account for the presence of terminator.  In
particular, most code motion transformation between loops should leave the
terminator in place, and code motion between loops and non-affine blocks should
drop the terminator.

PiperOrigin-RevId: 240536998
2019-03-29 17:44:24 -07:00
River Riddle f9d91531df Replace usages of Instruction with Operation in the /IR directory.
This is step 2/N to renaming Instruction to Operation.

PiperOrigin-RevId: 240459216
2019-03-29 17:43:37 -07:00
Feng Liu c489f50e6f Add a trait to set the result type by attribute
Before this CL, the result type of the pattern match results need to be as same
as the first operand type, operand broadcast type or a generic tensor type.
This CL adds a new trait to set the result type by attribute. For example, the
TFL_ConstOp can use this to set the output type to its value attribute.

PiperOrigin-RevId: 240441249
2019-03-29 17:43:06 -07:00
Nicolas Vasilache 8811e284e8 Add an IndexedValue::operator Value*
This avoids the need to explicitly convert to a ValueHandle when using an Indexed where a Value* is expected.

PiperOrigin-RevId: 240371014
2019-03-29 17:41:17 -07:00
Alex Zinenko a7215a9032 Allow creating standalone Regions
Currently, regions can only be constructed by passing in a `Function` or an
`Instruction` pointer referencing the parent object, unlike `Function`s or
`Instruction`s themselves that can be created without a parent.  It leads to a
rather complex flow in operation construction where one has to create the
operation first before being able to work with its regions.  It may be
necessary to work with the regions before the operation is created.  In
particular, in `build` and `parse` functions that are executed _before_ the
operation is created in cases where boilerplate region manipulation is required
(for example, inserting the hypothetical default terminator in affine regions).
Allow creating standalone regions.  Such regions are meant to own a list of
blocks and transfer them to other regions on demand.

Each instruction stores a fixed number of regions as trailing objects and has
ownership of them.  This decreases the size of the Instruction object for the
common case of instructions without regions.  Keep this behavior intact.  To
allow some flexibility in construction, make OperationState store an owning
vector of regions.  When the Builder creates an Instruction from
OperationState, the bodies of the regions are transferred into the
instruction-owned regions to minimize copying.  Thus, it becomes possible to
fill standalone regions with blocks and move them to an operation when it is
constructed, or move blocks from a region to an operation region, e.g., for
inlining.

PiperOrigin-RevId: 240368183
2019-03-29 17:40:59 -07:00
River Riddle 832567b379 NFC: Rename the 'for' operation in the AffineOps dialect to 'affine.for' and set the namespace of the AffineOps dialect to 'affine'.
PiperOrigin-RevId: 240165792
2019-03-29 17:39:03 -07:00
River Riddle 9c6e92360c NFC: Rename the 'if' operation in the AffineOps dialect to 'affine.if'.
PiperOrigin-RevId: 240071154
2019-03-29 17:36:53 -07:00
Chris Lattner d9b5bc8f55 Remove OpPointer, cleaning up a ton of code. This also moves Ops to using
inherited constructors, which is cleaner and means you can now use DimOp()
to get a null op, instead of having to use Instruction::getNull<DimOp>().

This removes another 200 lines of code.

PiperOrigin-RevId: 240068113
2019-03-29 17:36:21 -07:00
Jacques Pienaar 7ab37aaf02 Fix missing parenthesis around negation.
This should probably be changed to instead use the negated form (e.g., get predicate + negate it + get resulting template), but this fixes it locally.

PiperOrigin-RevId: 240067116
2019-03-29 17:36:06 -07:00
Nicolas Vasilache f26c7cd792 Cleanup ValueHandleArray
We just need a way to unpack ArrayRef<ValueHandle> to ArrayRef<Value*>.
No need to expose this to the user.

This reduces the cognitive overhead for the tutorial.

PiperOrigin-RevId: 240037425
2019-03-29 17:35:20 -07:00
Chris Lattner 986310a68f Remove const from Value, Instruction, Argument, and the various methods on the
*Op classes.  This is a net reduction by almost 400LOC.

PiperOrigin-RevId: 239972443
2019-03-29 17:34:33 -07:00
Jacques Pienaar b236041b93 Return operand_range instead for generated variadic operands accessor.
PiperOrigin-RevId: 239959381
2019-03-29 17:34:17 -07:00
Chris Lattner 5246bceee0 Now that ConstOpPointer is gone, we can change the various methods generated by
tblgen be non-const.  This requires introducing some const_cast's at the
moment, but those (and lots more stuff) will disappear in subsequent patches.

This significantly simplifies those patches because the various tblgen op emitters
get adjusted.

PiperOrigin-RevId: 239954566
2019-03-29 17:33:45 -07:00
River Riddle fdef161592 Remove "<label>" from the llvm basic block CHECK names.
PiperOrigin-RevId: 239898185
2019-03-29 17:32:06 -07:00
Jacques Pienaar 5546733ec4 Start elemental type constraint specification modelling.
Enable users specifying operand type constraint combinations (e.g., considering multiple operands). Some of these will be refactored (particularly the OpBase change and that should also not be needed to be done by most users), but the focus is more on user side (shown in test). The generated code for this does not take any known facts into account or perform any simplification.

Start with 2 primities to specify 1) whether an operand has a specific element type, and 2) whether an operand's element type matches another operands element type.

PiperOrigin-RevId: 239875712
2019-03-29 17:31:50 -07:00
Nicolas Vasilache 071ca8da91 Support composition of symbols in AffineApplyOp
This CL revisits the composition of AffineApplyOp for the special case where a symbol
itself comes from an AffineApplyOp.
This is achieved by rewriting such symbols into dims to allow composition to occur mathematically.
The implementation is also refactored to improve readability.

Rationale for locally rewriting symbols as dims:
================================================
The mathematical composition of AffineMap must always concatenate symbols
because it does not have enough information to do otherwise. For example,
composing `(d0)[s0] -> (d0 + s0)` with itself must produce
`(d0)[s0, s1] -> (d0 + s0 + s1)`.

The result is only equivalent to `(d0)[s0] -> (d0 + 2 * s0)` when
applied to the same mlir::Value* for both s0 and s1.
As a consequence mathematical composition of AffineMap always concatenates
symbols.

When AffineMaps are used in AffineApplyOp however, they may specify
composition via symbols, which is ambiguous mathematically. This corner case
is handled by locally rewriting such symbols that come from AffineApplyOp
into dims and composing through dims.

PiperOrigin-RevId: 239791597
2019-03-29 17:30:59 -07:00
Nicolas Vasilache e21c101037 Add intrinsics for constants
PiperOrigin-RevId: 239596595
2019-03-29 17:28:12 -07:00
Lei Zhang a09dc8a491 [TableGen] Generate op declaration and definition into different files
Previously we emit both op declaration and definition into one file and include it
in *Ops.h. That pulls in lots of implementation details in the header file and we
cannot hide symbols local to implementation. This CL splits them to provide a cleaner
interface.

The way how we define custom builders in TableGen is changed accordingly because now
we need to distinguish signatures and implementation logic. Some custom builders with
complicated logic now can be moved to be implemented in .cpp entirely.

PiperOrigin-RevId: 239509594
2019-03-29 17:26:26 -07:00
Nicolas Vasilache d6c650cfb5 Properly propagate induction variable in tiling
This CL fixes an issue where cloned loop induction variables were not properly
propagated and beefs up the corresponding test.

PiperOrigin-RevId: 239422961
2019-03-29 17:25:53 -07:00
River Riddle 30e68230bd Add support for a standard TupleType. Though this is a standard type, it merely provides a common mechanism for representing tuples in MLIR. It is up to dialect authors to provides operations for manipulating them, e.g. extract_tuple_element.
TupleType has the following form:
   tuple-type ::= `tuple` `<` (type (`,` type)*)? `>`

Example:

// Empty tuple.
tuple<>

// Single element.
tuple<i32>

// Multi element.
tuple<i32, tuple<f32>, i16>

PiperOrigin-RevId: 239226021
2019-03-29 17:25:09 -07:00
Jacques Pienaar 57270a9a99 Remove some statements that required >C++11, add includes and qualify names. NFC.
PiperOrigin-RevId: 239197784
2019-03-29 17:24:53 -07:00
River Riddle 6d6ff7298a Add support for parsing true/false inside of a splat tensor literal.
PiperOrigin-RevId: 239052061
2019-03-29 17:24:09 -07:00
Nicolas Vasilache a89d8c0a1a Port Tablegen'd reference implementation of Add to declarative builders.
PiperOrigin-RevId: 238977252
2019-03-29 17:22:36 -07:00
Nicolas Vasilache 3a12bc5041 Remove LOAD/STORE/RETURN boilerplate in declarative builders.
This CL introduces a ValueArrayHandle helper to manage the implicit conversion
of ArrayRef<ValueHandle> -> ArrayRef<Value*> by converting first to ValueArrayHandle.
Without this, boilerplate operations that take ArrayRef<Value*> cannot be removed easily.

This all seems to boil down to decoupling Value from Type.
Alternative solutions exist (e.g. MLIR using Value by value everywhere) but they would be very intrusive. This seems to be the lowest impedance change.

Intrinsics are also lowercased by popular demand.

PiperOrigin-RevId: 238974125
2019-03-29 17:22:20 -07:00
Nicolas Vasilache f43388e4ce Port LowerVectorTransfers from EDSC + AST to declarative builders
This CL removes the dependency of LowerVectorTransfers on the AST version of EDSCs which will be retired.

This exhibited a pretty fundamental staging difference in AST-based vs declarative based emission.

Since the delayed creation with an AST was staged, the loop order came into existence after the clipping expressions were computed.
This now changes as the loops first need to be created declaratively in fixed order and then the clipping expressions are created.
Also, due to lack of staging, coalescing cannot be done on the fly anymore and
needs to be done either as a pre-pass (current implementation) or as a local transformation on the generated IR (future work).

Tests are updated accordingly.

PiperOrigin-RevId: 238971631
2019-03-29 17:22:06 -07:00
River Riddle 076a7350e2 Add an instrumentation for conditionally printing the IR before and after pass execution. This instrumentation can be added directly to the PassManager via 'enableIRPrinting'. mlir-opt exposes access to this instrumentation via the following flags:
* print-ir-before=(comma-separated-pass-list)
  - Print the IR before each of the passes provided within the pass list.
* print-ir-before-all
  - Print the IR before every pass in the pipeline.
* print-ir-after=(comma-separated-pass-list)
  - Print the IR after each of the passes provided within the pass list.
* print-ir-after-all
  - Print the IR after every pass in the pipeline.
* print-ir-module-scope
  - Always print the Module IR, even for non module passes.

PiperOrigin-RevId: 238523649
2019-03-29 17:19:57 -07:00
Alex Zinenko 276fae1b0d Rename BlockList into Region
NFC.  This is step 1/n to specifying regions as parts of any operation.

PiperOrigin-RevId: 238472370
2019-03-29 17:18:04 -07:00
Uday Bondhugula e1e455f7dd Change parallelism detection test pass to emit a note
- emit a note on the loop being parallel instead of setting a loop attribute
- rename the pass -test-detect-parallel (from -detect-parallel)

PiperOrigin-RevId: 238122847
2019-03-29 17:16:27 -07:00
Feng Liu c52a812700 [TableGen] Support nested dag attributes arguments in the result pattern
Add support to create a new attribute from multiple attributes. It extended the
DagNode class to represent attribute creation dag. It also changed the
RewriterGen::emitOpCreate method to support this nested dag emit.

An unit test is added.

PiperOrigin-RevId: 238090229
2019-03-29 17:15:57 -07:00
Uday Bondhugula 9f2781e8dd Fix misc bugs / TODOs / other improvements to analysis utils
- fix for getConstantBoundOnDimSize: floordiv -> ceildiv for extent
- make getConstantBoundOnDimSize also return the identifier upper bound
- fix unionBoundingBox to correctly use the divisor and upper bound identified by
  getConstantBoundOnDimSize
- deal with loop step correctly in addAffineForOpDomain (covers most cases now)
- fully compose bound map / operands and simplify/canonicalize before adding
  dim/symbol to FlatAffineConstraints; fixes false positives in -memref-bound-check; add
  test case there
- expose mlir::isTopLevelSymbol from AffineOps

PiperOrigin-RevId: 238050395
2019-03-29 17:15:27 -07:00
Uday Bondhugula 075090f891 Extend loop unrolling and unroll-jamming to non-matching bound operands and
multi-result upper bounds, complete TODOs, fix/improve test cases.

- complete TODOs for loop unroll/unroll-and-jam. Something as simple as
  "for %i = 0 to %N" wasn't being unrolled earlier (unless it had been written
  as "for %i = ()[s0] -> (0)()[%N] to %N"; addressed now.

- update/replace getTripCountExpr with buildTripCountMapAndOperands; makes it
  more powerful as it composes inputs into it

- getCleanupLowerBound and getUnrolledLoopUpperBound actually needed the same
  code; refactor and remove one.

- reorganize test cases, write previous ones better; most of these changes are
  "label replacements".

- fix wrongly labeled test cases in unroll-jam.mlir

PiperOrigin-RevId: 238014653
2019-03-29 17:14:12 -07:00
MLIR Team 8d62a6092f Clean up some stray mlfunc/cfgfunc leftovers.
PiperOrigin-RevId: 237936610
2019-03-29 17:13:26 -07:00
Nicolas Vasilache dfd904d4a9 Minor changes to the EDSC API NFC
This CL makes some minor changes to the declarative builder Helpers:
1. adds lb, ub, step methods to MemRefView to avoid always having to go through std::get + range;
2. drops MemRefView& from IndexedValue which was just creating ownership concerns. Instead, an IndexedValue only needs to keep track of the ValueHandle from which a MemRefView can be constructed on-demand if necessary.

PiperOrigin-RevId: 237861493
2019-03-29 17:12:41 -07:00
Lei Zhang e1595df1af Allow input and output to have different element types for broadcastable ops
TensorFlow comparison ops like tf.Less supports broadcast behavior but the result
type have different element types as the input types. Extend broadcastable trait
to allow such cases. Added tf.Less to demonstrate it.

PiperOrigin-RevId: 237846127
2019-03-29 17:12:26 -07:00
Lei Zhang 7972dcef84 Pull shape broadcast out as a stand-alone utility function
So that we can use this function to deduce broadcasted shapes elsewhere.

Also added support for unknown dimensions, by following TensorFlow behavior.

PiperOrigin-RevId: 237846065
2019-03-29 17:12:11 -07:00
River Riddle e46ba31c66 Add a new instrumentation for timing pass and analysis execution. This is made available in mlir-opt via the 'pass-timing' and 'pass-timing-display' flags. The 'pass-timing-display' flag toggles between the different available display modes for the timing results. The current display modes are 'list' and 'pipeline', with 'list' representing the default.
Below shows the output for an example mlir-opt command line.

mlir-opt foo.mlir -verify-each=false -cse -canonicalize -cse -cse -pass-timing

list view (-pass-timing-display=list):
* In this mode the results are displayed in a list sorted by total time; with each pass/analysis instance aggregated into one unique result. This mode is similar to the output of 'time-passes' in llvm-opt.

===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0097 seconds (0.0096 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0051 ( 58.3%)   0.0001 ( 12.2%)   0.0052 ( 53.8%)   0.0052 ( 53.8%)  Canonicalizer
   0.0025 ( 29.1%)   0.0005 ( 58.2%)   0.0031 ( 31.9%)   0.0031 ( 32.0%)  CSE
   0.0011 ( 12.6%)   0.0003 ( 29.7%)   0.0014 ( 14.3%)   0.0014 ( 14.2%)  DominanceInfo
   0.0087 (100.0%)   0.0009 (100.0%)   0.0097 (100.0%)   0.0096 (100.0%)  Total

pipeline view (-pass-timing-display=pipeline):
* In this mode the results are displayed in a nested pipeline view that mirrors the internal pass pipeline that is being executed in the pass manager. This view is useful for understanding specifically which parts of the pipeline are taking the most time, and can also be used to identify when analyses are being invalidated and recomputed.

===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0082 seconds (0.0081 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0042 (100.0%)   0.0039 (100.0%)   0.0082 (100.0%)   0.0081 (100.0%)  Function Pipeline
   0.0005 ( 11.6%)   0.0008 ( 21.1%)   0.0013 ( 16.1%)   0.0013 ( 16.2%)    CSE
   0.0002 (  5.0%)   0.0004 (  9.3%)   0.0006 (  7.0%)   0.0006 (  7.0%)      (A) DominanceInfo
   0.0026 ( 61.8%)   0.0018 ( 45.6%)   0.0044 ( 54.0%)   0.0044 ( 54.1%)    Canonicalizer
   0.0005 ( 11.7%)   0.0005 ( 13.0%)   0.0010 ( 12.3%)   0.0010 ( 12.4%)    CSE
   0.0003 (  6.1%)   0.0003 (  8.3%)   0.0006 (  7.2%)   0.0006 (  7.1%)      (A) DominanceInfo
   0.0002 (  3.8%)   0.0001 (  2.8%)   0.0003 (  3.3%)   0.0003 (  3.3%)    CSE
   0.0042 (100.0%)   0.0039 (100.0%)   0.0082 (100.0%)   0.0081 (100.0%)  Total

PiperOrigin-RevId: 237825367
2019-03-29 17:11:25 -07:00
Nicolas Vasilache 861eb87471 [EDSC] Cleanup declarative builder insertion point with blocks
Declarative builders want to provide the same nesting interface for blocks and loops. MLIR on the other hand has different behaviors:
1. when an AffineForOp is created the insertion point does not enter the loop body;
2. when a Block is created, the insertion point does enter the block body.

Guard against the second behavior in EDSC to make the interface unsurprising.
This also surfaces two places in the eager branch API where I was guarding against this behavior indirectly by creating a new ScopedContext.

Instead, uniformize everything to properly reset the insertion point in the unique place that builds the mlir::Block*.

PiperOrigin-RevId: 237619513
2019-03-29 17:09:51 -07:00
Nicolas Vasilache 0d925c5510 Follow up on custom instruction support.
This CL addresses a few post-submit comments:
1. better comments,
2. check number of results before dyn_cast (which is a less common case)
3. test usage for multi-result InstructionHandle

PiperOrigin-RevId: 237549333
2019-03-29 17:09:20 -07:00
Nicolas Vasilache eb19b4eefc Add support for custom ops in declarative builders.
This CL adds support for named custom instructions in declarative builders.
To allow this, it introduces a templated `CustomInstruction` class.
This CL also splits ValueHandle which can capture only the **value** in single-valued instructions from InstructionHandle which can capture any instruction but provide no typing and sugaring to extract the potential Value*.

PiperOrigin-RevId: 237543222
2019-03-29 17:09:05 -07:00
Lei Zhang 684cc6e8da [TableGen] Change to attach the name to DAG operator in result patterns
There are two ways that we can attach a name to a DAG node:

1) (Op:$name ...)
2) (Op ...):$name

The problem with 2) is that we cannot do it on the outmost DAG node in a tree.
Switch from 2) to 1).

PiperOrigin-RevId: 237513962
2019-03-29 17:08:05 -07:00
Lei Zhang 18fde7c9d8 [TableGen] Support multiple result patterns
This CL added the ability to generate multiple ops using multiple result
patterns, with each of them replacing one result of the matched source op.

Specifically, the syntax is

```
def : Pattern<(SourceOp ...),
              [(ResultOp1 ...), (ResultOp2 ...), (ResultOp3 ...)]>;
```

Assuming `SourceOp` has three results.

Currently we require that each result op must generate one result, which
can be lifted later when use cases arise.

To help with cases that certain output is unused and we don't care about it,
this CL also introduces a new directive: `verifyUnusedValue`. Checks will
be emitted in the `match()` method to make sure if the corresponding output
is not unused, `match()` returns with `matchFailure()`.

PiperOrigin-RevId: 237513904
2019-03-29 17:07:50 -07:00
Uday Bondhugula ce7e59536c Add a basic model to set tile sizes + some cleanup
- compute tile sizes based on a simple model that looks at memory footprints
  (instead of using the hardcoded default value)
- adjust tile sizes to make them factors of trip counts based on an option
- update loop fusion CL options to allow setting maximal fusion at pass creation
- change an emitError to emitWarning (since it's not a hard error unless the client
  treats it that way, in which case, it can emit one)

$ mlir-opt -debug-only=loop-tile -loop-tile test/Transforms/loop-tiling.mlir

test/Transforms/loop-tiling.mlir:81:3: note: using tile sizes [4 4 5 ]

  for %i = 0 to 256 {

for %i0 = 0 to 256 step 4 {
    for %i1 = 0 to 256 step 4 {
      for %i2 = 0 to 250 step 5 {
        for %i3 = #map4(%i0) to #map11(%i0) {
          for %i4 = #map4(%i1) to #map11(%i1) {
            for %i5 = #map4(%i2) to #map12(%i2) {
              %0 = load %arg0[%i3, %i5] : memref<8x8xvector<64xf32>>
              %1 = load %arg1[%i5, %i4] : memref<8x8xvector<64xf32>>
              %2 = load %arg2[%i3, %i4] : memref<8x8xvector<64xf32>>
              %3 = mulf %0, %1 : vector<64xf32>
              %4 = addf %2, %3 : vector<64xf32>
              store %4, %arg2[%i3, %i4] : memref<8x8xvector<64xf32>>
            }
          }
        }
      }
    }
  }

PiperOrigin-RevId: 237461836
2019-03-29 17:06:51 -07:00
Nicolas Vasilache 9e425a06f7 Fix an incorrect comment in builder-api-test.
Also address post commit cleanups that were missed.

PiperOrigin-RevId: 237122077
2019-03-29 17:03:00 -07:00
Nicolas Vasilache 7c0b9e8b62 Add helper classes to declarative builders to help write end-to-end custom ops.
This CL adds the same helper classes that exist in the AST form of EDSCs to support a basic indexing notation and emit the proper load and store operations and capture MemRefViews as function arguments.

This CL also adds a wrapper class LoopNestBuilder to allow generic rank-agnostic loops over indices.

PiperOrigin-RevId: 237113755
2019-03-29 17:02:41 -07:00
Lei Zhang 4fc9b51727 [TableGen] Emit verification code for op results
They can be verified using the same logic as operands.

PiperOrigin-RevId: 237101461
2019-03-29 17:02:26 -07:00
Dimitrios Vytiniotis 32943f5783 More graceful failure when verifying llvm.noalias.
PiperOrigin-RevId: 237081778
2019-03-29 17:01:56 -07:00
Dimitrios Vytiniotis 480cc2b063 Using llvm.noalias attribute when generating LLVMIR.
PiperOrigin-RevId: 237063104
2019-03-29 17:01:11 -07:00
Nicolas Vasilache ee4a80bbd6 Add an eager API version for BR and COND_BR
When building unstructured control-flow there is a need to construct mlir::Block* before being able to fill them. This invites goto-style programming.

This CL introduces an alternative eager API for BR and COND_BR in which blocks are created eagerly and captured on the fly.

This allows reducing the number of calls to `BlockBuilder` from 4 to 2 in the `builder_blocks_eager` test and from 3 to 2 in the `builder_cond_branch_eager` test.

PiperOrigin-RevId: 237046114
2019-03-29 17:00:26 -07:00
Nicolas Vasilache 38f1d2d77e Add support for Branches in edsc::Builder
This CL adds support for BranchHandle and BranchBuilder that require a slightly different
abstraction since an mlir::Block is not an mlir::Value.

This CL also adds support for the BR and COND_BR instructions and the relevant tests.

PiperOrigin-RevId: 237034312
2019-03-29 17:00:09 -07:00
Nicolas Vasilache af6c3f7a63 Start a new implementation for edsc::Builder
This CL reworks the design of EDSCs from first principles.
It introduces a ValueHandle which can hold either:
1. an eagerly typed, delayed Value*
2. a precomputed Value*

A ValueHandle can be manipulated with intrinsic operations a nested within a NestedBuilder. These NestedBuilder are a more idiomatic nested builder abstraction that should feel intuitive to program in C++.

Notably, this abstraction does not require an AST to stage the construction of MLIR snippets from C++. Instead, the abstraction makes use of orderings between definition and declaration of ValueHandles and provides a NestedBuilder and a LoopBuilder helper classes to handle those orderings.

All instruction creations are meant to use the templated ValueHandle::create<> which directly calls mlir::Builder.create<>.

For now the EDSC AST and the builders live side-by-side until the C API is ported.

PiperOrigin-RevId: 237030945
2019-03-29 16:59:50 -07:00
Alex Zinenko 95949a0d09 TableGen: allow mixing attributes and operands in the Arguments DAG of Op definition
The existing implementation of the Op definition generator assumes and relies
on the fact that native Op Attributes appear after its value-based operands in
the Arguments list.  Furthermore, the same order is used in the generated
`build` function for the operation.  This is not desirable for some operations
with mandatory attributes that would want the attribute to appear upfront for
better consistency with their textual representation, for example `cmpi` would
prefer the `predicate` attribute to be foremost in the argument list.

Introduce support for using attributes and operands in the Arguments DAG in no
particular order.  This is achieved by maintaining a list of Arguments that
point to either the value or the attribute and are used to generate the `build`
method.

PiperOrigin-RevId: 237002921
2019-03-29 16:59:35 -07:00
MLIR Team c1ff9e866e Use FlatAffineConstraints::unionBoundingBox to perform slice bounds union for loop fusion pass (WIP).
Adds utility to convert slice bounds to a FlatAffineConstraints representation.
Adds utility to FlatAffineConstraints to promote loop IV symbol identifiers to dim identifiers.

PiperOrigin-RevId: 236973261
2019-03-29 16:59:21 -07:00
Uday Bondhugula 5836fae8a0 DMA generation CL flag update
- allow mem capacity to be overridden by command-line flag
- change default fast mem space to 2

PiperOrigin-RevId: 236951598
2019-03-29 16:59:05 -07:00
Uday Bondhugula 7e288e7c19 Add missing run command to fusion test cases - follow up to cl/236882988
PiperOrigin-RevId: 236947383
2019-03-29 16:58:50 -07:00
Uday Bondhugula b34f8d3c83 Fix and improve detectAsMod
- fix for the mod detection
- simplify/avoid the mod at construction (if the dividend is already known to be less
  than the divisor), since the information is available at hand there

PiperOrigin-RevId: 236882988
2019-03-29 16:57:36 -07:00
Nicolas Vasilache 069c818f40 Fix lower/upper bound mismatch in stripmineSink
Also beef up the corresponding test case.

PiperOrigin-RevId: 236878818
2019-03-29 16:57:21 -07:00
Alex Zinenko dd75675080 TableGen: fix builder generation for optional attributes
The recently introduced support for generating MLIR Operations with optional
attributes did not handle the formatted string emission properly, in particular
it did not escape `{` and `}` in calls to `formatv` leading to assertions
during TableGen op definition generation.  Fix this by splitting out the
unncessary braces from the format string.  Additionally, fix the emission of
the builder argument comment to correctly indicate which attributes are indeed
optional and which are not.

PiperOrigin-RevId: 236832230
2019-03-29 16:56:50 -07:00
Uday Bondhugula a77734e185 Make sure that fusion test cases don't have out of bounds accesses
- fix out of bounds test case
- -memref-bound-check on the test/Transforms/loop-fusion.mlir no longer reports any
  errors, before or after -loop-fusion is run

PiperOrigin-RevId: 236757658
2019-03-29 16:56:35 -07:00
MLIR Team 39a1ddeb1c Adds loop attribute as a temporary work around to prevent slice fusion of loop nests containing instructions with side effects (the proper solution will be do use memref read/write regions in the future).
PiperOrigin-RevId: 236733739
2019-03-29 16:56:20 -07:00
Uday Bondhugula 12b9dece8d Bug fix for getConstantBoundOnDimSize
- this was detected when memref-bound-check was run on the output of the
  loop-fusion pass
- the addition (to represent ceildiv as a floordiv) had to be performed only
  for the constant term of the constraint
- update test cases
- memref-bound-check no longer returns an error on the output of this test case

PiperOrigin-RevId: 236731137
2019-03-29 16:56:06 -07:00
Dimitrios Vytiniotis a60ba7d908 Supporting conversion of argument attributes along their types.
This fixes a bug: previously, during conversion function argument
attributes were neither beings passed through nor converted. This fix
extends DialectConversion to allow for simultaneous conversion of the
function type and the argument attributes.

This was important when lowering MLIR to LLVM where attribute
information (e.g. noalias) needs to be preserved in MLIR(LLVMDialect).

Longer run it seems reasonable that we want to convert both the
function attribute and its type and the argument attributes, but that
requires a small refactoring in Function.h to aggregate these three
fields in an inner struct, which will require some discussion.

PiperOrigin-RevId: 236709409
2019-03-29 16:55:51 -07:00
River Riddle a495f960e0 Introduce the notion of dialect attributes and dependent attributes. A dialect attribute derives its context from a specific dialect, whereas a dependent attribute derives context from what it is attached to. Following this, we now enforce that functions and function arguments may only contain dialect specific attributes. These are generic entities and cannot provide any specific context for a dependent attribute.
Dialect attributes are defined as:

        dialect-namespace `.` attr-name `:` attribute-value

Dialects can override any of the following hooks to verify the validity of a given attribute:
  * verifyFunctionAttribute
  * verifyFunctionArgAttribute
  * verifyInstructionAttribute

PiperOrigin-RevId: 236507970
2019-03-29 16:55:05 -07:00
River Riddle eeeef090ef Set the namespace of the StandardOps dialect to "std", but add a special case to the parser to allow parsing standard operations without the "std" prefix. This will now allow for the standard dialect to be looked up dynamically by name.
PiperOrigin-RevId: 236493865
2019-03-29 16:54:20 -07:00
River Riddle f37651c708 NFC. Move all of the remaining operations left in BuiltinOps to StandardOps. The only thing left in BuiltinOps are the core MLIR types. The standard types can't be moved because they are referenced within the IR directory, e.g. in things like Builder.
PiperOrigin-RevId: 236403665
2019-03-29 16:53:35 -07:00
Lei Zhang 85d9b6c8f7 Use consistent names for dialect op source files
This CL changes dialect op source files (.h, .cpp, .td) to follow the following
convention:

  <full-dialect-name>/<dialect-namespace>Ops.{h|cpp|td}

Builtin and standard dialects are specially treated, though. Both of them do
not have dialect namespace; the former is still named as BuiltinOps.* and the
latter is named as Ops.*.

Purely mechanical. NFC.

PiperOrigin-RevId: 236371358
2019-03-29 16:53:19 -07:00
Uday Bondhugula 8254aabd4a A simple pass to detect and mark all parallel loops
- detect all parallel loops based on dep information and mark them with a
  "parallel" attribute
- add mlir::isLoopParallel(OpPointer<AffineForOp> ...), and refactor an existing method
  to use that (reuse some code from @andydavis (cl/236007073) for this)
- a simple/meaningful way to test memref dep test as well

Ex:

$ mlir-opt -detect-parallel test/Transforms/parallelism-detection.mlir
#map1 = ()[s0] -> (s0)
func @foo(%arg0: index) {
  %0 = alloc() : memref<1024x1024xvector<64xf32>>
  %1 = alloc() : memref<1024x1024xvector<64xf32>>
  %2 = alloc() : memref<1024x1024xvector<64xf32>>
  for %i0 = 0 to %arg0 {
    for %i1 = 0 to %arg0 {
      for %i2 = 0 to %arg0 {
        %3 = load %0[%i0, %i2] : memref<1024x1024xvector<64xf32>>
        %4 = load %1[%i2, %i1] : memref<1024x1024xvector<64xf32>>
        %5 = load %2[%i0, %i1] : memref<1024x1024xvector<64xf32>>
        %6 = mulf %3, %4 : vector<64xf32>
        %7 = addf %5, %6 : vector<64xf32>
        store %7, %2[%i0, %i1] : memref<1024x1024xvector<64xf32>>
      } {parallel: false}
    } {parallel: true}
  } {parallel: true}
  return
}

PiperOrigin-RevId: 236367368
2019-03-29 16:53:03 -07:00
MLIR Team d038e34735 Loop fusion for input reuse.
*) Breaks fusion pass into multiple sub passes over nodes in data dependence graph:
- first pass fuses single-use producers into their unique consumer.
- second pass enables fusing for input-reuse by fusing sibling nodes which read from the same memref, but which do not share dependence edges.
- third pass fuses remaining producers into their consumers (Note that the sibling fusion pass may have transformed a producer with multiple uses into a single-use producer).
*) Fusion for input reuse is enabled by computing a sibling node slice using the load/load accesses to the same memref, and fusion safety is guaranteed by checking that the sibling node memref write region (to a different memref) is preserved.
*) Enables output vector and output matrix computations from KFAC patches-second-moment operation to fuse into a single loop nest and reuse input from the image patches operation.
*) Adds a generic loop utilitiy for finding all sequential loops in a loop nest.
*) Adds and updates unit tests.

PiperOrigin-RevId: 236350987
2019-03-29 16:52:35 -07:00
River Riddle 269c872ee8 Add support for parsing and printing affine.if and affine.for attributes. The attribute dictionaries are printed after the final block list for both operations:
for %i = 0 to 10 {
     ...
  } {some_attr: true}

  if () : () () {
    ...
  } {some_attr: true}

  if () : () () {
    ...
  } else {
    ...
  } {some_attr: true}

PiperOrigin-RevId: 236346983
2019-03-29 16:52:19 -07:00
Uday Bondhugula 932e4fb29f Analysis support for floordiv/mod's in loop bounds/
- handle floordiv/mod's in loop bounds for all analysis purposes
- allows fusion slicing to be more powerful
- add simple test cases based on -memref-bound-check
- fusion based test cases in follow up CLs

PiperOrigin-RevId: 236328551
2019-03-29 16:52:04 -07:00
Uday Bondhugula 58889884a2 Change some of the debug messages to use emitError / emitWarning / emitNote - NFC
PiperOrigin-RevId: 236169676
2019-03-29 16:50:29 -07:00
River Riddle db1757f858 Add support for named function argument attributes. The attribute dictionary is printed after the argument type:
func @arg_attrs(i32 {arg_attr: 10})

func @arg_attrs(%arg0: i32 {arg_attr: 10})

PiperOrigin-RevId: 236136830
2019-03-29 16:50:15 -07:00
Alex Zinenko 8cc50208a6 LLVM IR Dialect: unify call and call0 operations
When the LLVM IR dialect was implemented, TableGen operation definition scheme
did not support operations with variadic results.  Therefore, the `call`
instruction was split into `call` and `call0` for the single- and zero-result
calls (LLVM does not support multi-result operations).  Unify `call` and
`call0` using the recently added TableGen support for operations with Variadic
results.  Explicitly verify that the new operation has 0 or 1 results.  As a
side effect, this change enables clean-ups in the conversion to the LLVM IR
dialect that no longer needs to rely on wrapped LLVM IR void types when
constructing zero-result calls.

PiperOrigin-RevId: 236119197
2019-03-29 16:49:59 -07:00
Alex Zinenko d9cc3c31cc ExecutionEngine OptUtils: support -On flags in string-based initialization
Original implementation of OutUtils provided two different LLVM IR module
transformers to be used with the MLIR ExecutionEngine: OptimizingTransformer
parameterized by the optimization levels (similar to -O3 flags) and
LLVMPassesTransformer parameterized by the string formatted similarly to
command line options of LLVM's "opt" tool without support for -O* flags.
Introduce such support by declaring the flags inside the parser and by
populating the pass managers similarly to what "opt" does.  Remove the
additional flags from mlir-cpu-runner as they can now be wrapped into
`-llvm-opts` together with other LLVM-related flags.

PiperOrigin-RevId: 236107292
2019-03-29 16:49:44 -07:00
River Riddle 0f8c3f4071 When parsing, check that a region operation is not referencing any of the entry arguments to its block lists.
PiperOrigin-RevId: 236030438
2019-03-29 16:49:29 -07:00
Uday Bondhugula a003179367 Detect more trivially redundant constraints better
- detect more trivially redundant constraints in
  FlatAffineConstraints::removeTrivialRedundantConstraints. Redundancy due to
  constraints that only differ in the constant part (eg., 32i + 64j - 3 >= 0, 32 +
  64j - 8 >= 0) is now detected.  The method is still linear-time and does
  a single scan over the FlatAffineConstraints buffer. This detection is useful
  and needed to eliminate redundant constraints generated after FM elimination.

- update GCDTightenInequalities so that we also normalize by the GCD while at
  it. This way more constraints will show up as redundant (232i - 203 >= 0
  becomes i - 1 >= 0 instead of 232i - 232 >= 0) without having to call
  normalizeConstraintsByGCD.

- In FourierMotzkinEliminate, call GCDTightenInequalities and
  normalizeConstraintsByGCD before calling removeTrivialRedundantConstraints()
  - so that more redundant constraints are detected. As a result, redundancy
    due to constraints like i - 5 >= 0, i - 7 >= 0, 2i - 5 >= 0, 232i - 203 >=
    0 is now detected (here only i >= 7 is non-redundant).

As a result of these, a -memref-bound-check on the added test case runs in 16ms
instead of 1.35s (opt build) and no longer returns a conservative result.

PiperOrigin-RevId: 235983550
2019-03-29 16:47:59 -07:00
MLIR Team c2766f3760 Fix bug in memref region computation with slice loop bounds. Adds loop IV values to ComputationSliceState which are used in FlatAffineConstraints::addSliceBounds, to ensure that constraints are only added for loop IV values which are present in the constraint system.
PiperOrigin-RevId: 235952912
2019-03-29 16:47:29 -07:00
Lei Zhang 493d46067b [TableGen] Use result names in build() methods if possible
This will make it clear which result's type we are expecting in the build() signature.

PiperOrigin-RevId: 235925706
2019-03-29 16:46:41 -07:00
Alex Zinenko 486dde42c0 EDSC: move FileCheck tests into the source file
EDSC provide APIs for constructing and modifying the IR.  These APIs are
currently tested by a "test" module pass that reads the dummy IR (empty
functions), recognizes certain function names and injects the IR into those
functions based on their name.  This situation is unsatisfactory because the
expected outcome of the test lives in a different file than the input to the
test, i.e. the API calls.

Create a new binary for tests that constructs the IR from scratch using EDSC
APIs and prints it.  Put FileCheck comments next to the printing.  This removes
the need to have a file with dummy inputs and assert on its contents in the
test driver.  The test source includes a simplistic test harness that runs all
functions marked as TEST_FUNC but intentionally does not include any
value-testing functionality.

PiperOrigin-RevId: 235886629
2019-03-29 16:46:10 -07:00
River Riddle 03913698a8 Allow function names to have a leading underscore. This matches what is already defined in the spec, but not supported in the implementation.
PiperOrigin-RevId: 235823663
2019-03-29 16:45:08 -07:00
River Riddle 3b3e11da93 Validate the names of attribute, dialect, and functions during verification. This essentially enforces the parsing rules upon their names.
PiperOrigin-RevId: 235818842
2019-03-29 16:44:53 -07:00
River Riddle 2d4b0e2c00 Add parser support for internal named attributes. These are attributes with names starting with ':'.
PiperOrigin-RevId: 235774810
2019-03-29 16:44:22 -07:00
River Riddle cdbfd48471 Rewrite the dominance info classes to allow for operating on arbitrary control flow within operation regions. The CSE pass is also updated to properly handle nested dominance.
PiperOrigin-RevId: 235742627
2019-03-29 16:43:35 -07:00
Dimitrios Vytiniotis 41c37c6246 Unboxing for static memrefs.
When lowering to MLIR(LLVMDialect) we unbox the structs that result
from converting static memrefs, that is, singleton structs
that just contain a raw pointer. This allows us to get rid of all
"extractvalue" instructions in the common case where shapes are fully
known.

PiperOrigin-RevId: 235706021
2019-03-29 16:43:20 -07:00
Alex Zinenko 1da1b4c321 LLVM IR dialect and translation: support conditional branches with arguments
Since the goal of the LLVM IR dialect is to reflect LLVM IR in MLIR, the
dialect and the conversion procedure must account for the differences betweeen
block arguments and LLVM IR PHI nodes. In particular, LLVM IR disallows PHI
nodes with different values coming from the same source. Therefore, the LLVM IR
dialect now disallows `cond_br` operations that have identical successors
accepting arguments, which would lead to invalid PHI nodes. The conversion
process resolves the potential PHI source ambiguity by injecting dummy blocks
if the same block is used more than once as a successor in an instruction.
These dummy blocks branch unconditionally to the original successors, pass them
the original operands (available in the dummy block because it is dominated by
the original block) and are used instead of them in the original terminator
operation.

PiperOrigin-RevId: 235682798
2019-03-29 16:43:05 -07:00
Nicolas Vasilache 62c54a2ec4 Add a stripmineSink and imperfectly nested tiling primitives.
This CL adds a primitive to perform stripmining of a loop by a given factor and
sinking it under multiple target loops.
In turn this is used to implement imperfectly nested loop tiling (with interchange) by repeatedly calling the stripmineSink primitive.

The API returns the point loops and allows repeated invocations of tiling to achieve declarative, multi-level, imperfectly-nested tiling.

Note that this CL is only concerned with the mechanical aspects and does not worry about analysis and legality.

The API is demonstrated in an example which creates an EDSC block, emits the corresponding MLIR and applies imperfectly-nested tiling:

```cpp
    auto block = edsc::block({
      For(ArrayRef<edsc::Expr>{i, j}, {zero, zero}, {M, N}, {one, one}, {
        For(k1, zero, O, one, {
          C({i, j, k1}) = A({i, j, k1}) + B({i, j, k1})
        }),
        For(k2, zero, O, one, {
          C({i, j, k2}) = A({i, j, k2}) + B({i, j, k2})
        }),
      }),
    });
    // clang-format on
    emitter.emitStmts(block.getBody());

    auto l_i = emitter.getAffineForOp(i), l_j = emitter.getAffineForOp(j),
         l_k1 = emitter.getAffineForOp(k1), l_k2 = emitter.getAffineForOp(k2);
    auto indicesL1 = mlir::tile({l_i, l_j}, {512, 1024}, {l_k1, l_k2});
    auto l_ii1 = indicesL1[0][0], l_jj1 = indicesL1[1][0];
    mlir::tile({l_jj1, l_ii1}, {32, 16}, l_jj1);
```

The edsc::Expr for the induction variables (i, j, k_1, k_2) provide the programmatic hooks from which tiling can be applied declaratively.

PiperOrigin-RevId: 235548228
2019-03-29 16:41:20 -07:00
Alex Zinenko e7193a70f8 EDSC: support conditional branch instructions
Leverage the recently introduced support for multiple argument groups and
multiple destination blocks in EDSC Expressions to implement conditional
branches in EDSC.  Conditional branches have two successors and three argument
groups.  The first group contains a single expression of i1 type that
corresponds to the condition of the branch.  The two following groups contain
arguments of the two successors of the conditional branch instruction, in the
same order as the successors.  Expose this instruction to the C API and Python
bindings.

PiperOrigin-RevId: 235542768
2019-03-29 16:41:05 -07:00
Alex Zinenko 83e8db2193 EDSC: support branch instructions
The new implementation of blocks was designed to support blocks with arguments.
More specifically, StmtBlock can be constructed with a list of Bindables that
will be bound to block aguments upon construction.  Leverage this functionality
to implement branch instructions with arguments.

This additionally requires the statement storage to have a list of successors,
similarly to core IR operations.

Becauase successor chains can form loops, we need a possibility to decouple
block declaration, after which it becomes usable by branch instructions, from
block body definition.  This is achieved by creating an empty block and by
resetting its body with a new list of instructions.  Note that assigning a
block from another block will not affect any instructions that may have
designated this block as their successor (this behavior is necessary to make
value-type semantics of EDSC types consistent).  Combined, one can now write
generators like

    EDSCContext context;
    Type indexType = ...;
    Bindable i(indexType), ii(indexType), zero(indexType), one(indexType);
    StmtBlock loopBlock({i}, {});
    loopBlock.set({ii = i + one,
                   Branch(loopBlock, {ii})});
    MLIREmitter(&builder)
        .bindConstant<ConstantIndexOp>(zero, 0)
        .bindConstant<ConstantIndexOp>(one, 1)
	.emitStmt(Branch(loopBlock, {zero}));

where the emitter will emit the statement and its successors, if present.

PiperOrigin-RevId: 235541892
2019-03-29 16:40:50 -07:00
Lei Zhang 4887e45546 [TableGen] Fix infinite loop in SubstLeaves substitution
Previously we have `auto pos = std::string::find(...) != std::string::npos` as
if condition to control substring substitution. Instead of the position for the
found substring, `pos` will be a boolean value indicating found nor not. Then
used as the replace start position, we were always replacing starting from 0 or
1. If the replaced substring also has the pattern to be matched, we'll see
an infinite loop.

PiperOrigin-RevId: 235504681
2019-03-29 16:39:47 -07:00
Alex Zinenko c98a87cc06 Lower standard DivF and RemF operations to the LLVM IR dialect
Add support for lowering DivF and RemF to LLVM::FDiv and LLMV::FRem
respectively.  The lowering is a trivial one-to-one transformation.
The corresponding operations already existed in the LLVM IR dialect and can be
lowered to the LLVM IR proper.  Add the necessary tests for scalar and vector
forms.

PiperOrigin-RevId: 234984608
2019-03-29 16:36:56 -07:00
Alex Zinenko 59a209721e EDSC: support call instructions
Introduce support for binding MLIR functions as constant expressions.  Standard
constant operation supports functions as possible constant values.

Provide C APIs to look up existing named functions in an MLIR module and expose
them to the Python bindings.  Provide Python bindings to declare a function in
an MLIR module without defining it and to add a definition given a function
declaration.  These declarations are useful when attempting to link MLIR
modules with, e.g., the standard library.

Introduce EDSC support for direct and indirect calls to other MLIR functions.
Internally, an indirect call is always emitted to leverage existing support for
delayed construction of MLIR Values using EDSC Exprs.  If the expression is
bound to a constant function (looked up or declared beforehand), MLIR constant
folding will be able to replace an indirect call by a direct call.  Currently,
only zero- and one-result functions are supported since we don't have support
for multi-valued expressions in EDSC.

Expose function calling interface to Python bindings on expressions by defining
a `__call__` function accepting a variable number of arguments.

PiperOrigin-RevId: 234959444
2019-03-29 16:36:26 -07:00
Alex Zinenko 0cc24bb1af EDSC: emit composed affine maps again
The recent rework of MLIREmitter switched to using the generic call to
`builder.createOperation` from OperationState instead of individual customized
calls to `builder.create<>`.  As a result, regular non-composed affine apply
operations where emitted.  Introduce a special case in Expr::build to always
create composed affine maps instead, as it used to be the case before the
rework.

Such special-casing goes against the idea of EDSC generality and extensibility.
Instead, we should consider declaring the composed form canonical for
affine.apply operations and using the builder support for creating operations
and canonicalizing them immediately (ongoing effort).

PiperOrigin-RevId: 234790129
2019-03-29 16:34:26 -07:00
Alex Zinenko d055a4e100 EDSC: support multi-expression loop bounds
MLIR supports 'for' loops with lower(upper) bound defined by taking a
maximum(minimum) of a list of expressions, but does not have first-class affine
constructs for the maximum(minimum).  All these expressions must have affine
provenance, similarly to a single-expression bound.  Add support for
constructing such loops using EDSC.  The expression factory function is called
`edsc::MaxMinFor` to (1) highlight that the maximum(minimum) operation is
applied to the lower(upper) bound expressions and (2) differentiate it from a
`edsc::For` that creates multiple perfectly nested loops (and should arguably
be called `edsc::ForNest`).

PiperOrigin-RevId: 234785996
2019-03-29 16:33:56 -07:00
Alex Zinenko a2a433652d EDSC: create constants as expressions
Introduce a functionality to create EDSC expressions from typed constants.
This complements the current functionality that uses "unbound" expressions and
binds them to a specific constant before emission.  It comes in handy in cases
where we want to check if something is a constant early during construciton
rather than late during emission, for example multiplications and divisions in
affine expressions.  This is also consistent with MLIR vision of constants
being defined by an operation (rather than being special kinds of values in the
IR) by exposing this operation as EDSC expression.

PiperOrigin-RevId: 234758020
2019-03-29 16:33:41 -07:00
Nicolas Vasilache ffdf98d092 [EDSC] Fix Stmt::operator= and allow DimOp in For loops
This CL fixes 2 recent issues with EDSCs:
1. the type of the LHS in Stmt::operator=(Expr rhs) should be the same as the (asserted unique) return type;
2. symbols coming from DimOp should be admissible as lower / upper bounds in For

The relevant tests are added.

PiperOrigin-RevId: 234750249
2019-03-29 16:33:26 -07:00
Uday Bondhugula a1dad3a5d9 Extend/improve getSliceBounds() / complete TODO + update unionBoundingBox
- compute slices precisely where the destination iteration depends on multiple source
  iterations (instead of over-approximating to the whole source loop extent)
- update unionBoundingBox to deal with input with non-matching symbols
- reenable disabled backend test case

PiperOrigin-RevId: 234714069
2019-03-29 16:33:11 -07:00
Uday Bondhugula 5021dc4fa0 DMA placement update - hoist loops invariant DMAs
- hoist DMAs past all loops immediately surrounding the region that the latter
  is invariant on - do this at DMA generation time itself

PiperOrigin-RevId: 234628447
2019-03-29 16:32:41 -07:00
Alex Zinenko b4dba895a6 EDSC: make Expr typed and extensible
Expose the result types of edsc::Expr, which are now stored for all types of
Exprs and not only for the variadic ones.  Require return types when an Expr is
constructed, if it will ever have some.  An empty return type list is
interpreted as an Expr that does not create a value (e.g. `return` or `store`).

Conceptually, all edss::Exprs are now typed, with the type being a (potentially
empty) tuple of return types.  Unbound expressions and Bindables must now be
constructed with a specific type they will take.  This makes EDSC less
evidently type-polymorphic, but we can still write generic code such as

    Expr sumOfSquares(Expr lhs, Expr rhs) { return lhs * lhs + rhs * rhs; }

and use it to construct different typed expressions as

    sumOfSquares(Bindable(IndexType::get(ctx)), Bindable(IndexType::get(ctx)));
    sumOfSquares(Bindable(FloatType::getF32(ctx)),
                 Bindable(FloatType::getF32(ctx)));

On the positive side, we get the following.
1. We can now perform type checking when constructing Exprs rather than during
   MLIR emission.  Nevertheless, this is still duplicates the Op::verify()
   until we can factor out type checking from that.
2. MLIREmitter is significantly simplified.
3. ExprKind enum is only used for actual kinds of expressions.  Data structures
   are converging with AbstractOperation, and the users can now create a
   VariadicExpr("canonical_op_name", {types}, {exprs}) for any operation, even
   an unregistered one without having to extend the enum and make pervasive
   changes to EDSCs.

On the negative side, we get the following.
1. Typed bindables are more verbose, even in Python.
2. We lose the ability to do print debugging for higher-level EDSC abstractions
   that are implemented as multiple MLIR Ops, for example logical disjunction.

This is the step 2/n towards making EDSC extensible.

***

Move MLIR Op construction from MLIREmitter::emitExpr to Expr::build since Expr
now has sufficient information to build itself.

This is the step 3/n towards making EDSC extensible.

Both of these strive to minimize the amount of irrelevant changes.  In
particular, this introduces more complex pretty-printing for affine and binary
expression to make sure tests continue to pass.  It also relies on string
comparison to identify specific operations that an Expr produces.

PiperOrigin-RevId: 234609882
2019-03-29 16:31:26 -07:00
Alex Zinenko 0a4c940c1b EDSC: introduce support for blocks
EDSC currently implement a block as a statement that is itself a list of
statements.  This suffers from two modeling problems: (1) these blocks are not
addressable, i.e. one cannot create an instruction where thus constructed block
is a successor; (2) they support block nesting, which is not supported by MLIR
blocks.  Furthermore, emitting such "compound statement" (misleadingly named
`Block` in Python bindings) does not actually produce a new Block in the IR.

Implement support for creating actual IR Blocks in EDSC.  In particular, define
a new StmtBlock EDSC class that is neither an Expr nor a Stmt but contains a
list of Stmts.  Additionally, StmtBlock may have (early-) typed arguments.
These arguments are Bindable expressions that can be used inside the block.
Provide two calls in the MLIREmitter, `emitBlock` that actually emits a new
block and `emitBlockBody` that only emits the instructions contained in the
block without creating a new block.  In the latter case, the instructions must
not use block arguments.

Update Python bindings to make it clear when instruction emission happens
without creating a new block.

PiperOrigin-RevId: 234556474
2019-03-29 16:30:56 -07:00
Uday Bondhugula f97c1c5b06 Misc. updates/fixes to analysis utils used for DMA generation; update DMA
generation pass to make it drop certain assumptions, complete TODOs.

- multiple fixes for getMemoryFootprintBytes
  - pass loopDepth correctly from getMemoryFootprintBytes()
  - use union while computing memory footprints

- bug fixes for addAffineForOpDomain
  - take into account loop step
  - add domains of other loop IVs in turn that might have been used in the bounds

- dma-generate: drop assumption of "non-unit stride loops being tile space loops
  and skipping those and recursing to inner depths"; DMA generation is now purely
  based on available fast mem capacity and memory footprint's calculated

- handle memory region compute failures/bailouts correctly from dma-generate

- loop tiling cleanup/NFC

- update some debug and error messages to use emitNote/emitError in
  pipeline-data-transfer pass - NFC

PiperOrigin-RevId: 234245969
2019-03-29 16:30:26 -07:00
MLIR Team 58aa383e60 Support fusing producer loop nests which write to a memref which is live out, provided that the write region of the consumer loop nest to the same memref is a super set of the producer's write region.
PiperOrigin-RevId: 234240958
2019-03-29 16:30:11 -07:00
Alex Zinenko 4bb31f7377 ExecutionEngine: provide utils for running CLI-configured LLVM passes
A recent change introduced a possibility to run LLVM IR transformation during
JIT-compilation in the ExecutionEngine.  Provide helper functions that
construct IR transformers given either clang-style optimization levels or a
list passes to run.  The latter wraps the LLVM command line option parser to
parse strings rather than actual command line arguments.  As a result, we can
run either of

    mlir-cpu-runner -O3 input.mlir
    mlir-cpu-runner -some-mlir-pass -llvm-opts="-llvm-pass -other-llvm-pass"

to combine different transformations.  The transformer builder functions are
provided as a separate library that depends on LLVM pass libraries unlike the
main execution engine library.  The library can be used for integrating MLIR
execution engine into external frameworks.

PiperOrigin-RevId: 234173493
2019-03-29 16:29:41 -07:00
MLIR Team 8f5f2c765d LoopFusion: perform a series of loop interchanges to increase the loop depth at which slices of producer loop nests can be fused into constumer loop nests.
*) Adds utility to LoopUtils to perform loop interchange of two AffineForOps.
*) Adds utility to LoopUtils to sink a loop to a specified depth within a loop nest, using a series of loop interchanges.
*) Computes dependences between all loads and stores in the loop nest, and classifies each loop as parallel or sequential.
*) Computes loop interchange permutation required to sink sequential loops (and raise parallel loop nests) while preserving relative order among them.
*) Checks each dependence against the permutation to make sure that dependences would not be violated by the loop interchange transformation.
*) Calls loop interchange in LoopFusion pass on consumer loop nests before fusing in producers, sinking loops with loop carried dependences deeper into the consumer loop nest.
*) Adds and updates related unit tests.

PiperOrigin-RevId: 234158370
2019-03-29 16:29:26 -07:00
Alex Zinenko ffc9043604 LLVM dialect conversion and target: support indirect calls
Add support for converting MLIR `call_indirect` instructions to the LLVM IR
dialect.  In LLVM IR, the same instruction is used for direct and indirect
calls.  In the dialect, we have `llvm.call` and `llvm.call0` to work around the
absence of the void type in MLIR.  For direct calls, the callee is stored as
instruction attribute.  Use the same pair of instructions for indirect calls by
omitting the callee attribute.  In the MLIR to LLVM IR translator, check the
presence of attribute to decide whether to construct a direct or an indirect
call using different LLVM IR Builder functions.

Add support for converting constants of function type to the LLVM IR dialect
and for translating them to the LLVM IR proper.  The `llvm.constant` operation
works similarly to other types: its attribute has MLIR function type but the
value it produces has LLVM IR function type wrapped in the dialect type.  While
lowering, look up the pointer to the converted function in the corresponding
mapping.

PiperOrigin-RevId: 234132351
2019-03-29 16:28:56 -07:00
Alex Zinenko d7aa700ccb Dialect conversion: decouple function signature conversion from type conversion
Function types are built-in in MLIR and affect the validity of the IR itself.
However, advanced target dialects such as the LLVM IR dialect may include
custom function types.  Until now, dialect conversion was expecting function
types not to be converted to the custom type: although the signatures was
allowed to change, the outer type must have been an mlir::FunctionType.  This
effectively prevented dialect conversion from creating instructions that
operate on values of the custom function type.

Dissociate function signature conversion from general type conversion.
Function signature conversion must still produce an mlir::FunctionType and is
used in places where built-in types are required to make IR valid.  General
type conversion is used for SSA values, including function and block arguments
and function results.

Exercise this behavior in the LLVM IR dialect conversion by converting function
types to LLVM IR function pointer types.  The pointer to a function is chosen
to provide consistent lowering of higher-order functions: while it is possible
to have a value of function type, it is not possible to create a function type
accepting a returning another function type.

PiperOrigin-RevId: 234124494
2019-03-29 16:28:41 -07:00
MLIR Team affb2193cc Update direction vector computation to use FlatAffineConstraints::getLower/UpperBounds.
Update FlatAffineConstraints::getLower/UpperBounds to project to the identifier for which bounds are being computed. This change enables computing bounds on an identifier which were previously dependent on the bounds of another identifier.

PiperOrigin-RevId: 234017514
2019-03-29 16:28:25 -07:00
Alex Zinenko 50700b8122 Reimplement LLVM IR translation to use the MLIR LLVM IR dialect
Original implementation of the translation from MLIR to LLVM IR operated on the
Standard+BuiltIn dialect, with a later addition of the SuperVector dialect.
This required the translation to be aware of a potetially large number of other
dialects as the infrastructure extended.  With the recent introduction of the
LLVM IR dialect into MLIR, the translation can be switched to only translate
the LLVM IR dialect, and the translation of the operations becomes largely
mechanical.

The reimplementation of the translator follows the lines of the original
translator in function and basic block conversion.  In particular, block
arguments are converted to LLVM IR PHI nodes, which are connected to their
sources after all blocks of a function had been converted.  Thanks to LLVM IR
types being wrapped in the MLIR LLVM dialect type, type conversion is
simplified to only convert function types, all other types are simply
unwrapped.  Individual instructions are constructed using the LLVM IRBuilder,
which has a great potential for being table-generated from the LLVM IR dialect
operation definitions.

The input of the test/Target/llvmir.mlir is updated to use the MLIR LLVM IR
dialect.  While it is now redundant with the dialect conversion test, the point
of the exercise is to guarantee exactly the same LLVM IR is emitted.  (Only the
name of the allocation function is changed from `__mlir_alloc` to `alloc` in
the CHECK lines.)  It will be simplified in a follow-up commit.

PiperOrigin-RevId: 233842306
2019-03-29 16:27:10 -07:00
Jacques Pienaar 388fb3751e Add pattern constraints.
Enable matching pattern only if constraint is met. Start with type constraints and more general C++ constraints.

PiperOrigin-RevId: 233830768
2019-03-29 16:26:53 -07:00
Alex Zinenko 465746f262 LLVM IR Dialect: port DimOp lowering from the translator
DimOp is converted to a constant LLVM IR dialect operation for static
dimensions and to an access to the dynamic size info stored in the memref
descriptor for the dynamic dimensions.  This behavior is consistent with the
existing mlir-translator.

This completes the porting of MLIR -> LLVM lowering to the dialect conversion
infrastructure.

PiperOrigin-RevId: 233665634
2019-03-29 16:26:23 -07:00
Uday Bondhugula 00860662a2 Generate dealloc's for alloc's of pipeline-data-transfer
- for the DMA transfers being pipelined through double buffering, generate
  deallocs for the double buffers being alloc'ed

This change is along the lines of cl/233502632. We initially wanted to experiment with
scoped allocation - so the deallocation's were usually not necessary; however, they are
needed even with scoped allocations in some situations - for eg. when the enclosing loop
gets unrolled. The dealloc serves as an end of lifetime marker.

PiperOrigin-RevId: 233653463
2019-03-29 16:25:53 -07:00
Alex Zinenko 8de7f6c471 LLVM IR Dialect: add select op and lower standard select to it
This is a similar one-to-one mapping.

PiperOrigin-RevId: 233621006
2019-03-29 16:25:23 -07:00
Alex Zinenko ed81ddc865 EDSC: support 'for' loops with dynamic bounds
The existing implementation in EDSC of 'for' loops in MLIREmitter is
unnecessarily restricted to constant bounds.  The underlying AffineForOp can be
constructed from (a list of) Values and AffineMaps instead of constants.  Its
verifier will check that the "affine provenance" conditions, i.e. that the
values used in the loop conditions are defined in such a way that they can be
analyzed by affine passes, are respected.  One can use non-constant values in
affine loop bounds in conjunction with a single-dimensional identity affine
map.  Implement this in MLIREmitter while maintaining the special case for
constant bounds that leads to significantly simpler generated IR when
applicable.

Test this change using the EDSC lowering test pass to inject code emitted from
EDSC into functions with predefined names.

PiperOrigin-RevId: 233578220
2019-03-29 16:24:53 -07:00
Tatiana Shpeisman 2e6cd60d3b Add dialect-specific decoding for opaque constants.
Associates opaque constants with a particular dialect. Adds general mechanism to register dialect-specific hooks defined in external components. Adds hooks to decode opaque tensor constant and extract an element of an opaque tensor constant.

This CL does not change the existing mechanism for registering constant folding hook yet. One thing at a time.

PiperOrigin-RevId: 233544757
2019-03-29 16:24:38 -07:00
Uday Bondhugula 8b3f841daf Generate dealloc's for the alloc's of dma-generate.
- for the DMA buffers being allocated (and their tags), generate corresponding deallocs
- minor related update to replaceAllMemRefUsesWith and PipelineDataTransfer pass

Code generation for DMA transfers was being done with the initial simplifying
assumption that the alloc's would map to scoped allocations, and so no
deallocations would be necessary. Drop this assumption to generalize. Note that
even with scoped allocations, unrolling loops that have scoped allocations
could create a series of allocations and exhaustion of fast memory. Having a
end of lifetime marker like a dealloc in fact allows creating new scopes if
necessary when lowering to a backend and still utilize scoped allocation.
DMA buffers created by -dma-generate are guaranteed to have either
non-overlapping lifetimes or nested lifetimes.

PiperOrigin-RevId: 233502632
2019-03-29 16:24:08 -07:00
Uday Bondhugula f5eed89df0 Fix + cleanup for getMemRefRegion()
- determine symbols for the memref region correctly

- this wasn't exposed earlier since we didn't have any test cases where the
  portion of the nest being DMAed for was non-hyperrectangular (i.e., bounds of
  one IV  depending on other IVs within that part)

PiperOrigin-RevId: 233493872
2019-03-29 16:23:53 -07:00
Jacques Pienaar 7897257265 Add binary broadcastable builder.
* Add common broadcastable binary adder in TF ops and use for a few ops;
  - Adding Sub, Mul here
* Change the prepare lowering to use TF variants;
* Add some more legalization patterns;

PiperOrigin-RevId: 233310952
2019-03-29 16:23:38 -07:00
Lei Zhang a57b398906 [TableGen] Assign created ops to variables and rewrite with PatternRewriter::replaceOp()
Previously we were using PatternRewrite::replaceOpWithNewOp() to both create the new op
inline and rewrite the matched op. That does not work well if we want to generate multiple
ops in a sequence. To support that, this CL changed to assign each newly created op to a
separate variable.

This CL also refactors how PatternEmitter performs the directive dispatch logic.

PiperOrigin-RevId: 233206819
2019-03-29 16:22:53 -07:00
Alex Zinenko d7e6b33e93 Convert MemRefCastOp to the LLVM IR dialect
Add support for converting `memref_cast` operations into the LLVM IR dialect.
This goes beyond want is currently implemented in the MLIR standard ops to LLVM
IR translation, but follows the general principles of the memref descriptors.
A memref cast creates a new descriptor containing the same buffer pointer but a
potentially different number of dynamic sizes (as many as dynamic dimensions in
the target memref type).  The lowering copies the buffer pointer to the new
descriptor and inserts dynamic sizes to it.  If the size is static in the
source type, a constant value is inserted as the dynamic size, otherwise a
dynamic value is copied from the source descriptor, taking into account the
difference in dynamic size positions in the descriptor.

PiperOrigin-RevId: 233082035
2019-03-29 16:22:38 -07:00
River Riddle 366ebcf6aa Remove the restriction that only registered terminator operations may terminate a block and have block operands. This allows for any operation to hold block operands. It also introduces the notion that unregistered operations may terminate a block. As such, the 'isTerminator' api on Instruction has been split into 'isKnownTerminator' and 'isKnownNonTerminator'.
PiperOrigin-RevId: 233076831
2019-03-29 16:22:23 -07:00
Alex Zinenko 4c35bbbb51 Port load/store op translation to LLVM IR dialect lowering
Implement the lowering of memref load and store standard operations into the
LLVM IR dialect.  This largely follows the existing mechanism in
MLIR-to-LLVM-IR translation for the sake of compatibility.  A memref value is
transformed into a memref descriptor value which holds the pointer to the
underlying data buffer and the dynamic memref sizes.  The data buffer is
contiguous.  Accesses to multidimensional memrefs are linearized in row-major
form.  In linear address computation, statically known sizes are used as
constants while dynamic sizes are extracted from the memref descriptor.

PiperOrigin-RevId: 233043846
2019-03-29 16:21:53 -07:00
Uday Bondhugula c419accea3 Automated rollback of changelist 232728977.
PiperOrigin-RevId: 232944889
2019-03-29 16:21:38 -07:00
Smit Hinsu c201e6ef05 Handle dynamic shapes in Broadcastable op trait
That allows TensorFlow Add and Div ops to use Broadcastable op trait instead of
more restrictive SameValueType op trait.

That in turn allows TensorFlow ops to be registered by defining GET_OP_LIST and
including the generated ops file. Currently, tf-raise-control-flow pass tests
are using dynamic shapes in tf.Add op and AddOp can't be registered without
supporting the dynamic shapes.

TESTED with unit tests

PiperOrigin-RevId: 232927998
2019-03-29 16:21:23 -07:00
River Riddle 13a45c7194 Add verification for AffineApply/AffineFor/AffineIf dimension and symbol operands. This also allows a DimOp to be a valid dimension identifier if its operand is a valid dimension identifier.
PiperOrigin-RevId: 232923468
2019-03-29 16:21:08 -07:00
Alex Zinenko 36c0516c78 Disallow zero dimensions in vectors and memrefs
Aggregate types where at least one dimension is zero do not fully make sense as
they cannot contain any values (their total size is zero).  However, TensorFlow
and XLA support tensors with zero sizes, so we must support those too.  This is
relatively safe since, unlike vectors and memrefs, we don't have first-class
element accessors for MLIR tensors.

To support sparse element attributes of vector types that have no non-zero
elements, make sure that index and value element attributes have tensor type so
that we never need to create a zero vector type internally.  Note that this is
already consistent with the inline documentation of the sparse elements
attribute.  Users of the sparse elements attribute should not rely on the
storage schema anyway.

PiperOrigin-RevId: 232896707
2019-03-29 16:20:38 -07:00
Alex Zinenko 99b19c1d20 Disallow hexadecimal literals in type declarations
Existing IR syntax is ambiguous in type declarations in presence of zero sizes.
In particular, `0x1` in the type size can be interpreted as either a
hexadecimal literal corresponding to 1, or as two distinct decimal literals
separated by an `x` for sizes.  Furthermore, the shape `<0xi32>` fails lexing
because it is expected to be an integer literal.

Fix the lexer to treat `0xi32` as an integer literal `0` followed by a bare
identifier `xi32` (look one character ahead and early return instead of
erroring out).

Disallow hexadecimal literals in type declarations and forcibly split the token
into multiple parts while parsing the type.  Note that the splitting trick has
been already present to separate the element type from the preceding `x`
character.

PiperOrigin-RevId: 232880373
2019-03-29 16:20:22 -07:00
River Riddle a886625813 Modify the canonicalizations of select and muli to use the fold hook.
This also extends the greedy pattern rewrite driver to add the operands of folded operations back to the worklist.

PiperOrigin-RevId: 232878959
2019-03-29 16:20:06 -07:00
Uday Bondhugula 4ba8c9147d Automated rollback of changelist 232717775.
PiperOrigin-RevId: 232807986
2019-03-29 16:19:33 -07:00
River Riddle fd2d7c857b Rename the 'if' operation in the AffineOps dialect to 'affine.if' and namespace
the AffineOps dialect with 'affine'.

PiperOrigin-RevId: 232728977
2019-03-29 16:18:59 -07:00
Alex Zinenko e9493cf14d Port alloc/dealloc LLVM IR conversion into the LLVM IR dialect lowering
Implement the lowering of memref allocation and deallocation standard
operations into the LLVM IR dialect.  This largely follows the existing
mechanism in MLIR-to-LLVM-IR translation for the sake of compatibility.
A memref value is transformed into a memref descriptor value which holds the
pointer to the underlying data buffer and the dynamic memref sizes.  The buffer
is allocated using `malloc` and freed using `free`.  The lowering inserts
declarations of these functions if necessary.  Memref descriptors are values of
the LLVM IR structure type wrapped into an MLIR LLVM dialect type.  The pointer
to the buffer and the individual sizes are accessed using `extractvalue` and
`insertvalue` LLVM IR instructions.

PiperOrigin-RevId: 232719419
2019-03-29 16:18:14 -07:00
River Riddle 90d10b4e00 NFC: Rename the 'for' operation in the AffineOps dialect to 'affine.for'. The is the second step to adding a namespace to the AffineOps dialect.
PiperOrigin-RevId: 232717775
2019-03-29 16:17:59 -07:00
River Riddle 3227dee15d NFC: Rename affine_apply to affine.apply. This is the first step to adding a namespace to the affine dialect.
PiperOrigin-RevId: 232707862
2019-03-29 16:17:29 -07:00
River Riddle 0c65cf283c Move the AffineFor loop bound folding to a canonicalization pattern on the AffineForOp.
PiperOrigin-RevId: 232610715
2019-03-29 16:16:11 -07:00
River Riddle 423715056d Emit a parser error when the min/max prefix is missing from a multi value AffineFor loop bound AffineMap.
PiperOrigin-RevId: 232609693
2019-03-29 16:15:56 -07:00
River Riddle 10237de8eb Refactor the affine analysis by moving some functionality to IR and some to AffineOps. This is important for allowing the affine dialect to define canonicalizations directly on the operations instead of relying on transformation passes, e.g. ComposeAffineMaps. A summary of the refactoring:
* AffineStructures has moved to IR.

* simplifyAffineExpr/simplifyAffineMap/getFlattenedAffineExpr have moved to IR.

* makeComposedAffineApply/fullyComposeAffineMapAndOperands have moved to AffineOps.

* ComposeAffineMaps is replaced by AffineApplyOp::canonicalize and deleted.

PiperOrigin-RevId: 232586468
2019-03-29 16:15:41 -07:00
Alex Zinenko 40d5d09f9d Print parens around the return type of a function if it is also a function type
Existing type syntax contains the following productions:

    function-type ::= type-list-parens `->` type-list
    type-list ::= type | type-list-parens
    type ::= <..> | function-type

Due to these rules, when the parser sees `->` followed by `(`, it cannot
disambiguate if `(` starts a parenthesized list of function result types, or a
parenthesized list of operands of another function type, returned from the
current function.  We would need an unknown amount of lookahead to try to find
the `->` at the right level of function nesting to differentiate between type
lists and singular function types.

Instead, require the result type of the function that is a function type itself
to be always parenthesized, at the syntax level.  Update the spec and the
parser to correspond to the production rule names used in the spec (although it
would have worked without modifications).  Fix the function type parsing bug in
the process, as it used to accept the non-parenthesized list of types for
arguments, disallowed by the spec.

PiperOrigin-RevId: 232528361
2019-03-29 16:14:50 -07:00
Alex Zinenko 3fa22b88de Print non-default attribute types in optional attr dictionary
In optional attribute dictionary used, among others, in the generic form of the
ops, attribute types for integers and floats are omitted.  This could lead to
inconsistencies when round-tripping the IR, in particular the attributes are
created with incorrect types after parsing (integers default to i64, floats
default to f64).  Provide API to emit a trailing type after the attribute for
integers and floats.  Use it while printing the optional attribute dictionary.

Omitting types for i64 and f64 is a pragmatic decision that minimizes changes
in tests.  We may want to reconsider in the future and always print types of
attributes in the generic form.

PiperOrigin-RevId: 232480116
2019-03-29 16:14:05 -07:00
MLIR Team a78edcda5b Loop fusion improvements:
*) After a private memref buffer is created for a fused loop nest, dependences on the old memref are reduced, which can open up fusion opportunities. In these cases, users of the old memref are added back to the worklist to be reconsidered for fusion.
*) Fixed a bug in fusion insertion point dependence check where the memref being privatized was being skipped from the check.

PiperOrigin-RevId: 232477853
2019-03-29 16:13:50 -07:00
Jacques Pienaar 2afd655622 Add option print functions with the generic form.
The generic form may be more desirable even when there is a custom form
specified so add option to enable emitting it. This also exposes a current bug
when round tripping constant with function attribute.

PiperOrigin-RevId: 232350712
2019-03-29 16:11:38 -07:00