llvm-project

Commit Graph

Author	SHA1	Message	Date
River Riddle	f8341cfe06	Verify that the parsed predicate attribute of a cmpi operation is a string. PiperOrigin-RevId: 229419703	2019-03-29 15:18:53 -07:00
Alex Zinenko	0e58de70e7	Initial version of the LLVM IR dialect LLVM IR types are defined using MLIR's extendable type system. The dialect provides a single type kind, LLVMType, that wraps an llvm::Type*. Since LLVM IR types are pointer-unique, MLIR's type system relies on those pointers to perform its own type uniquing. Type parsing and printing is delegated to LLVM libraries. Define MLIR operations for the LLVM IR instructions currently used by the translation to the LLVM IR Target to simplify eventual transition. Operation classes are defined using TableGen. LLVM IR instruction operands that are only allowed to take constant values are accepted as attributes instead. All operations are using verbose form for printing and parsing. PiperOrigin-RevId: 229400375	2019-03-29 15:18:37 -07:00
Alex Zinenko	44e9869f1a	TableGen: extract TypeConstraints from Type MLIR has support for type-polymorphic instructions, i.e. instructions that may take arguments of different types. For example, standard arithmetic operations take scalars, vectors or tensors. In order to express such instructions in TableGen, we need to be able to verify that a type object satisfies certain constraints, but we don't need to construct an instance of this type. The existing TableGen definition of Type requires both. Extract out a TypeConstraint TableGen class to define restrictions on types. Define the Type TableGen class as a subclass of TypeConstraint for consistency. Accept records of the TypeConstraint class instead of the Type class as values in the Arguments class when defining operators. Replace the predicate logic TableGen class based on conjunctive normal form with the predicate logic classes allowing for arbitrary combinations of predicates using Boolean operators (AND/OR/NOT). The combination is implemented using simple string rewriting of C++ expressions and, therefore, respects the short-circuit evaluation order. No logic simplification is performed at the TableGen level so all expressions must be valid C++. Maintaining CNF using TableGen only would have been complicated when one needed to introduce top-level disjunction. It is also unclear if it could lead to significantly simpler emitted C++ code. In the future, we may replace in-place predicate string combination with a tree structure that can be simplified in TableGen's C++ driver. Combined, these changes allow one to express traits like ArgumentsAreFloatLike directly in TableGen instead of relying on C++ trait classes. PiperOrigin-RevId: 229398247	2019-03-29 15:18:23 -07:00
Uday Bondhugula	4598dafa30	Parsing DmaStartOp: check if source, destination, and tag are of memref type. - fix along the lines of cl/229390720 by @riverriddle PiperOrigin-RevId: 229395218	2019-03-29 15:18:07 -07:00
River Riddle	d50dc4fd6d	When parsing DmaWait, check that the tag is a MemRef type. PiperOrigin-RevId: 229390720	2019-03-29 15:17:52 -07:00
River Riddle	25d5b895fd	When parsing Select/Cmpi standard operations, emit an error if the type does not have a valid i1 shape instead of crashing. PiperOrigin-RevId: 229384794	2019-03-29 15:17:22 -07:00
Jacques Pienaar	02ba8fd6d9	Move tests and add missing BUILD file. Updated the extracted base classes here. The test wasn't updated post the move. PiperOrigin-RevId: 229353434	2019-03-29 15:16:38 -07:00
River Riddle	6c1631b3f8	Check that at least one constraint is parsed when parsing an IntegerSet. PiperOrigin-RevId: 229248638	2019-03-29 15:15:08 -07:00
River Riddle	ed26dd0421	Add a canonicalization pattern for conditional branch to fold constant branch conditions. PiperOrigin-RevId: 229242007	2019-03-29 15:14:37 -07:00
River Riddle	06b0bd9651	Emit unsupported error when parsing a DenseElementsAttr with an integer type of greater than 64 bits. DenseElementsAttr currently does not support value bitwidths of > 64. This can result in asan failures and crashes when trying to invoke DenseElementsAttr::writeBits/DenseElementsAttr::readBits. PiperOrigin-RevId: 229241125	2019-03-29 15:14:23 -07:00
River Riddle	e0594ce732	Add missing return post parse failure for the indices of a sparse attribute. PiperOrigin-RevId: 229231462	2019-03-29 15:14:07 -07:00
MLIR Team	38c2fe3158	LoopFusion: automate selection of source loop nest slice depth and destination loop nest insertion depth based on a simple cost model (cost model can be extended/replaced at a later time). *) LoopFusion: Adds fusion cost function which compares the cost of the fused loop nest with the cost of the two unfused loop nests to determine if it is profitable to fuse the candidate loop nests. The fusion cost function is run for various combinations of src/dst loop depths, attempting to find the minimum cost setting for src/dst loop depths which does not increase the computational cost when the loop nests are fused. Combinations of src/dst loop depth are evaluated attempting to maximize loop depth (i.e. take a bigger computation slice from the source loop nest, and insert it deeper in the destination loop nest for better locality). *) LoopFusion: Adds utility to compute op instance count for loop nests, sliced loop nests, and to compute the cost of a loop nest fused with another sliced loop nest. *) LoopFusion: canonicalizes slice bound AffineMaps (and updates related tests). *) Analysis::Utils: Splits getBackwardComputationSlice into two functions: one which calculates and returns the slice loop bounds for analysis by LoopFusion, and the other for insertion of the computation slice (once fusion has calculated the min-cost src/dst loop depths). *) Test: Adds multiple unit tests to test the new functionality. PiperOrigin-RevId: 229219757	2019-03-29 15:13:53 -07:00
River Riddle	a674ae8bbd	Return an empty IntegerSet if the '(' is not parsed. PiperOrigin-RevId: 229198934	2019-03-29 15:13:25 -07:00
Nicolas Vasilache	0ab81776aa	Fix typo in lower_vector_transfers.mlir PiperOrigin-RevId: 229010160	2019-03-29 15:12:40 -07:00
Nicolas Vasilache	d734c50c5f	[MLIR] Clip all access dimensions during LowerVectorTransfers This CL adds a short term remedy to an issue that was found during execution tests. Lowering of vector transfer ops uses the permutation map to determine which ForInst have been super-vectorized. During materialization to HW vector sizes however, some of those dimensions may be fully unrolled and do not appear in the permutation map. Such dimensions were then not clipped and may have accessed out of bounds. This CL conservatively clips all dimensions to ensure no out of bounds access. The longer term solution is still up for debate but will probably require either passing more information between Materialization and lowering, or just merging the 2 passes. PiperOrigin-RevId: 228980787	2019-03-29 15:12:26 -07:00
Uday Bondhugula	c35d6b4f2d	Drop -canonicalize from -dma-generate test case cmd - should be testing on the output of -dma-generate and not '-dma-generate -canonicalize'; save trouble for those updating -canonicalize in the future! PiperOrigin-RevId: 228915192	2019-03-29 15:11:26 -07:00
River Riddle	3fe8eb3f22	Add check for '[' when parsing a tensor literal list. PiperOrigin-RevId: 228913908	2019-03-29 15:11:11 -07:00
Jacques Pienaar	71ec869011	Fix omitted return post failed parse PiperOrigin-RevId: 228903905	2019-03-29 15:10:25 -07:00
Lei Zhang	311af4abf3	Const fold splat vectors/tensors in standard add, sub, and mul ops The const folding logic is structurally similar, so use a template to abstract the common part. Moved mul(x, 0) to a legalization pattern to be consistent with mul(x, 1). Also promoted getZeroAttr() to be a method on Builder since it is expected to be frequently used. PiperOrigin-RevId: 228891989	2019-03-29 15:09:55 -07:00
Jacques Pienaar	78da6704b7	Verify string type token before attempting to get string value. Add repro that would have resulted in crash previously. PiperOrigin-RevId: 228890749	2019-03-29 15:09:40 -07:00
Nicolas Vasilache	cfa5831960	Uniformize composition of AffineApplyOp by construction This CL is the 5th on the path to simplifying AffineMap composition. This removes the distinction between normalized single-result AffineMap and more general composed multi-result map. One nice byproduct of making the implementation driven by single-result is that the multi-result extension is a trivial change: the implementation is still single-result and we just use: ``` unsigned idx = getIndexOf(...); map.getResult(idx); ``` This CL also fixes an AffineNormalizer implementation issue related to symbols. Namely it stops performing substitutions on symbols in AffineNormalizer and instead concatenates them all to be consistent with the call to `AffineMap::compose(AffineMap)`. This latter call to `compose` cannot perform simplifications of symbols coming from different maps based on positions only: i.e. dims are applied and renumbered but symbols must be concatenated. The only way to determine whether symbols from different AffineApply are the same is to look at the concrete values. The canonicalizeMapAndOperands is thus extended with behavior to support replacing operands that appear multiple times. Lastly, this CL demonstrates that the implementation is correct by rewriting ComposeAffineMaps using only `makeComposedAffineApply`. The implementation uses a matcher because AffineApplyOp are introduced as composed operations on the fly instead of iteratively forwardSubstituting. For this purpose, a walker would revisit freshly introduced AffineApplyOp. Regardless, ComposeAffineMaps is scheduled to disappear, this CL replaces the implementation based on iterative `forwardSubstitute` by a composed-by-construction `makeComposedAffineApply`. Remaining calls to `forwardSubstitute` will be removed in the next CL. PiperOrigin-RevId: 228830443	2019-03-29 15:08:40 -07:00
Uday Bondhugula	2370c601ba	Add safeguard against FM explosion - FM has a worst case exponential complexity. For our purposes, this worst case is rarely expected, but could still appear due to improperly constructed constraints (e.g. a logical/memory error in other methods) or artificially created arbitrarily complex integer sets (adversarial / fuzz tests). Add a check to detect such an explosion in the number of constraints and conservatively return false from isEmpty() (instead of running out of memory or running for too long). - Add an artificial virus test case. PiperOrigin-RevId: 228753496	2019-03-29 15:07:55 -07:00
Alex Zinenko	9003490287	Implement branch-free single-division lowering of affine division/remainder This implements the lowering of `floordiv`, `ceildiv` and `mod` operators from affine expressions to the arithmetic primitive operations. Integer division rules in affine expressions explicitly require rounding towards either negative or positive infinity unlike machine implementations that round towards zero. In the general case, implementing `floordiv` and `ceildiv` using machine signed division requires computing both the quotient and the remainder. When the divisor is positive, this can be simplified by adjusting the dividend and the quotient by one and switching signs. In the current use cases, we are unlikely to encounter affine expressions with negative divisors (affine divisions appear in loop transformations such as tiling that guarantee that divisors are positive by construction). Therefore, it is reasonable to use branch-free single-division implementation. In case of affine maps, divisors can only be literals so we can check the sign and implement the case for negative divisors when the need arises. The affine lowering pass can still fail when applied to semi-affine maps (division or modulo by a symbol). PiperOrigin-RevId: 228668181	2019-03-29 15:07:40 -07:00
Uday Bondhugula	742c37abc9	Fix DMA overlap pass buffer mapping - the double buffer should be indexed (iv floordiv step) % 2 and NOT (iv % 2); step wasn't being accounted for. - fix test cases, enable failing test cases PiperOrigin-RevId: 228635726	2019-03-29 15:07:10 -07:00
Alex Zinenko	6ce30becd7	Support verbose parsing and printing of terminator operations Originally, terminators were special kinds of operation and could not be extended by dialects. Only builtin terminators were supported and they had custom parsers and printers. Currently, "terminator" is a property of an operation, making it possible for dialects to define custom terminators. However, verbose forms of operation syntax were not designed to support terminators that may have a list of successors (each successor contains a block name and an optional operand list). Calling printDefaultOp on a terminator drops all successor information. Dialects are thus required to provide custom parsers and printers for their terminators. Introduce the syntax for the list of successors in the verbose form of the operation. Add support for printing and parsing verbose operations with successors. Note that this does not yet add support for unregistered terminators since "terminator" is a property stored in AbstractOperation and therefore is only available for registered operations that have an instance of AbstractOperation. Add tests for verbose parsing. It is currently impossible to test round-trip for verbose terminators because none of the known dialects use verbose syntax for printing terminators by default, however the printer was exercised on the LLVM IR dialect prototype. PiperOrigin-RevId: 228566453	2019-03-29 15:06:26 -07:00
Uday Bondhugula	303c09299f	Fix affine expr flattener bug + improve simplification in a particular scenario - fix visitDivExpr: constraints constructed for localVarCst used the original divisor instead of the simplified divisor; fix this. Add a simple test case in memref-bound-check that reproduces this bug - although this was encountered in the context of slicing for fusion. - improve mod expr flattening: when flattening mod expressions, cancel out the GCD of the numerator and denominator so that we can get a simpler flattened form along with a simpler floordiv local var for it PiperOrigin-RevId: 228539928	2019-03-29 15:06:11 -07:00
Nicolas Vasilache	1f78d63f05	[MLIR] Make SuperVectorization use normalized AffineApplyOp Supervectorization does not plan on handling multi-result AffineMaps and non-canonical chains of > 1 AffineApplyOp. This CL uses the simpler single-result unbounded AffineApplyOp in the MaterializeVectors pass. PiperOrigin-RevId: 228469085	2019-03-29 15:05:55 -07:00
Nicolas Vasilache	c6f798a976	Introduce AffineMap::compose(AffineMap) This CL is the 2nd on the path to simplifying AffineMap composition. This CL uses the now accepted `AffineExpr::compose(AffineMap)` to implement `AffineMap::compose(AffineMap)`. Implications of keeping the simplification function in Analysis are documented where relevant. PiperOrigin-RevId: 228276646	2019-03-29 15:04:20 -07:00
River Riddle	8eccc429b7	Add parser support for named type aliases. Alias identifiers can be used in the place of the types that they alias, and are defined as: type-alias-def ::= '!' alias-name '=' 'type' type type-alias ::= '!' alias-name Example: !avx.m128 = type vector<4 x f32> ... "foo"(%x) : vector<4 x f32> -> () // becomes: "foo"(%x) : !avx.m128 -> () PiperOrigin-RevId: 228271372	2019-03-29 15:04:05 -07:00
Uday Bondhugula	e94ba6815a	Fix 0-d memref corner case for getMemRefRegion() - fix crash on test/Transforms/canonicalize.mlir with -memref-bound-check PiperOrigin-RevId: 228268486	2019-03-29 15:03:50 -07:00
Nicolas Vasilache	c449e46ceb	Introduce AffineExpr::compose(AffineMap) This CL is the 1st on the path to simplifying AffineMap composition. This CL uses the now accepted AffineExpr.replaceDimsAndSymbols to implement `AffineExpr::compose(AffineMap)`. Arguably, `simplifyAffineExpr` should be part of IR and not Analysis but this CL does not yet pull the trigger on that. PiperOrigin-RevId: 228265845	2019-03-29 15:03:36 -07:00
Uday Bondhugula	21baf86a2f	Extend loop-fusion's slicing utility + other fixes / updates - refactor toAffineFromEq and the code surrounding it; refactor code into FlatAffineConstraints::getSliceBounds - add FlatAffineConstraints methods to detect identifiers as mod's and div's of other identifiers - add FlatAffineConstraints::getConstantLower/UpperBound - Address b/122118218 (don't assert on invalid fusion depths cmdline flags - instead, don't do anything); change cmdline flags src-loop-depth -> fusion-src-loop-depth - AffineExpr/Map print method update: don't fail on null instances (since we have a wrapper around a pointer, it's avoidable); rationale: dump/print methods should never fail if possible. - Update memref-dataflow-opt to add an optimization to avoid an unnecessary call to IsRangeOneToOne when it's trivially going to be true. - Add additional test cases to exercise the new support - update a few existing test cases since the maps are now generated uniformly with all destination loop operands appearing for the backward slice - Fix projectOut - fix wrong range for getBestElimCandidate. - Fix for getConstantBoundOnDimSize() - didn't show up in any test cases since we didn't have any non-hyperrectangular ones. PiperOrigin-RevId: 228265152	2019-03-29 15:03:20 -07:00
Uday Bondhugula	b934d75b8f	Convert expr - c * (expr floordiv c) to expr mod c in AffineExpr - Detect 'mod' to replace the combination of floordiv, mul, and subtract when possible at construction time; when 'c' is a power of two, this reduces the number of operations; also more compact and readable. Update simplifyAdd for this. On a side note: - with the affine expr flattening we have, a mod expression like d0 mod c would be flattened into d0 - c * q, c * q <= d0 <= c * q + c - 1, with 'q' being added as the local variable (q = d0 floordiv c); as a result, a mod was turned into a floordiv whenever the expression was reconstructed back, i.e., as d0 - c * (d0 floordiv c); as a result of this change, we recover the mod back. - rename SimplifyAffineExpr -> SimplifyAffineStructures (pass had been renamed but the file hadn't been). PiperOrigin-RevId: 228258120	2019-03-29 15:02:56 -07:00
River Riddle	3b2c5600d9	Add support for types belonging to unknown dialects. This allows for types to be round tripped even if the dialect that defines them is not linked in. These types will be represented by a new "UnknownType" that uniques them based upon the dialect namespace and raw string type data. PiperOrigin-RevId: 228184629	2019-03-29 15:01:11 -07:00
Alex Zinenko	92a899f629	Drop all uses of the ForInst induction variable before deleting ForInst The `for` instruction defines the loop induction variable it uses. In the well-formed IR, the induction variable can only be used by the body of the `for` loop. Existing implementation was explicitly cleaning the body of the for loop to remove all uses of the induction variable before removing its definition. However, in ill-formed IR that may appear in some stages of parsing, there may be (invalid) users of the loop induction variable outside the loop body. In case of unsuccessful parsing, destructor of the ForInst-defined Value would assert because there are remaining though invalid users of this Value. Explicitly drop all uses of the loop induction Value when destroying a ForInst. It is no longer necessary to explicitly clean the body of the loop, destructor of the block will take care of this. PiperOrigin-RevId: 228168880	2019-03-29 15:00:26 -07:00
Alex Zinenko	3b7b0040ce	FunctionParser::~FunctionParser: avoid iterator invalidation When destroying a FunctionParser in case of parsing failure, we clean up all uses of undefined forward-declared references. This has been implemented as iteration over the list of uses. However, deleting one use from the list invalidates the iterator (`IROperand::drop` sets `nextUse` to `nullptr` while the iterator reads `nextUse` to advance; therefore only the first use was deleted from the list). Get a new iterator before calling drop to avoid invalidation. PiperOrigin-RevId: 228168849	2019-03-29 15:00:10 -07:00
Nicolas Vasilache	7c0bbe0939	Iterate on vector rather than DenseMap during AffineMap normalization This CL removes a flakiness associated with a spurious iteration on DenseMap iterators when normalizing AffineMap. PiperOrigin-RevId: 228160074	2019-03-29 14:59:37 -07:00
Alex Zinenko	c47ed53211	Add simple constant folding hook for CmpIOp Integer comparisons can be constant folded if both of their arguments are known constants, which we can compare in the compiler. This requires implementing all comparison predicates, but thanks to consistency between LLVM and MLIR comparison predicates, we have a one-to-one correspondence between predicates and llvm::APInt comparison functions. Constant folding of comparisons with maximum/minimum values of the integer type is left for future work. This will be used to test the lowering of mod/floordiv/ceildiv in affine expressions at compile time. PiperOrigin-RevId: 228077580	2019-03-29 14:59:22 -07:00
Alex Zinenko	caa7e70627	LLVM IR lowering: support integer division and remainder operations These operations trivially map to LLVM IR counterparts for operands of scalar and (one-dimensional) vector type. Multi-dimensional vector and tensor type operands would fail type conversion before the operation conversion takes place. Add tests for scalar and vector cases. Also add a test for vector `select` instruction for consistency with other tests. PiperOrigin-RevId: 228077564	2019-03-29 14:59:07 -07:00
Alex Zinenko	bc04556cf8	Introduce integer division and remainder operations This adds signed/unsigned integer division and remainder operations to the StandardOps dialect. Two versions are required because MLIR integers are signless, but the meaning of the leading bit is important in division and affects the results. LLVM IR made a similar choice. Define the operations in the tablegen file and add simple constant folding hooks in the C++ implementation. Handle signed division overflow and division by zero errors in constant folding. Canonicalization is left for future work. These operations are necessary to lower affine_apply's down to LLVM IR. PiperOrigin-RevId: 228077549	2019-03-29 14:58:52 -07:00
Uday Bondhugula	8496f2c30b	Complete TODOs / cleanup for loop-fusion utility - this is CL 1/2 that does a clean up and gets rid of one limitation in an underlying method - as a result, fusion works for more cases. - fix bugs/incomplete impl. in toAffineMapFromEq - fusing across rank changing reshapes for example now just works For eg. given a rank 1 memref to rank 2 memref reshape (64 -> 8 x 8) like this, -loop-fusion -memref-dataflow-opt now completely fuses and inlines/store-forward to get rid of the temporary: INPUT // Rank 1 -> Rank 2 reshape for %i0 = 0 to 64 { %v = load %A[%i0] store %v, %B[%i0 floordiv 8, %i0 mod 8] } for %i1 = 0 to 8 for %i2 = 0 to 8 %w = load %B[%i1, %i2] "foo"(%w) : (f32) -> () OUTPUT $ mlir-opt -loop-fusion -memref-dataflow-opt fuse_reshape.mlir #map0 = (d0, d1) -> (d0 * 8 + d1) mlfunc @fuse_reshape(%arg0: memref<64xf32>) { for %i0 = 0 to 8 { for %i1 = 0 to 8 { %0 = affine_apply #map0(%i0, %i1) %1 = load %arg0[%0] : memref<64xf32> "foo"(%1) : (f32) -> () } } } AFAIK, there is no polyhedral tool / compiler that can perform such fusion - because it's not really standard loop fusion, but possible through a generalized slicing-based approach such as ours. PiperOrigin-RevId: 227918338	2019-03-29 14:57:22 -07:00
Smit Hinsu	d3339ea2b8	Handle parsing failure for splat elements attribute Currently, it emits the error but does not terminate parsing. TESTED with unit test PiperOrigin-RevId: 227886274	2019-03-29 14:56:52 -07:00
Nicolas Vasilache	618c6a74c6	[MLIR] Introduce normalized single-result unbounded AffineApplyOp Supervectorization does not plan on handling multi-result AffineMaps and non-canonical chains of > 1 AffineApplyOp. This CL introduces a simpler abstraction and composition of single-result unbounded AffineApplyOp by using the existing unbound AffineMap composition. This CL adds a simple API call and relevant tests: ```c++ OpPointer<AffineApplyOp> makeNormalizedAffineApply( FuncBuilder b, Location loc, AffineMap map, ArrayRef<Value> operands); ``` which creates a single-result unbounded AffineApplyOp. The operands of AffineApplyOp are not themselves results of AffineApplyOp by construction. This represents the simplest possible interface to complement the composition of (mathematical) AffineMap, for the cases when we are interested in applying it to Value*. In this CL the composed AffineMap is not compressed (i.e. there exist operands that are not part of the result). A followup commit will compress to normal form. The single-result unbounded AffineApplyOp abstraction will be used in a followup CL to support the MaterializeVectors pass. PiperOrigin-RevId: 227879021	2019-03-29 14:56:37 -07:00
Chris Lattner	7983bbc251	Introduce a simple canonicalization of affine_apply that drops unused dims and symbols. Included with this is some other infra: - Testcases for other canonicalizations that I will implement next. - Some helpers in AffineMap/Expr for doing simple walks without defining whole visitor classes. - A 'replaceDimsAndSymbols' facility that I'll be using to simplify maps and exprs, e.g. to fold one constant into a mapping and to drop/renumber unused dims. - Allow index (and everything else) to work in memref's, as we previously discussed, to make the testcase easier to write. - A "getAffineBinaryExpr" helper to produce a binop when you know the kind as an enum. This line of work will eventually subsume the ComposeAffineApply pass, but it is nowhere close to that yet :-) PiperOrigin-RevId: 227852951	2019-03-29 14:56:07 -07:00
River Riddle	8abc06f3d5	Implement initial support for dialect specific types. Dialect specific types are registered similarly to operations, i.e. registerType<...> within the dialect. Unlike operations, there is no notion of a "verbose" type, that is all types must be registered to a dialect. Casting support(isa/dyn_cast/etc.) is implemented by reserving a range of type kinds in the top level Type class as opposed to string comparison like operations. To support derived types a few hooks need to be implemented: In the concrete type class: - static char typeID; * A unique identifier for the type used during registration. In the Dialect: - typeParseHook and typePrintHook must be implemented to provide parser support. The syntax for dialect extended types is as follows: dialect-type: '!' dialect-namespace '<' '"' type-specific-data '"' '>' The 'type-specific-data' is information used to identify different types within the dialect, e.g: - !tf<"variant"> // Tensor Flow Variant Type - !tf<"string"> // Tensor Flow String Type TensorFlow/TensorFlowControl types are now implemented as dialect specific types as a proof of concept. PiperOrigin-RevId: 227580052	2019-03-29 14:53:07 -07:00
Alex Zinenko	0c4ee54198	Merge LowerAffineApplyPass into LowerIfAndForPass, rename to LowerAffinePass This change is mechanical and merges the LowerAffineApplyPass and LowerIfAndForPass into a single LowerAffinePass. It makes a step towards defining an "affine dialect" that would contain all polyhedral-related constructs. The motivation for merging these two passes is based on retiring MLFunctions and, eventually, transforming If and For statements into regular operations. After that happens, LowerAffinePass becomes yet another legalization. PiperOrigin-RevId: 227566113	2019-03-29 14:52:52 -07:00
Alex Zinenko	fa710c17f4	LowerForAndIf: expand affine_apply's inplace Existing implementation was created before ML/CFG unification refactoring and did not concern itself with further lowering to separate concerns. As a result, it emitted `affine_apply` instructions to implement `for` loop bounds and `if` conditions and required a follow-up function pass to lower those `affine_apply` to arithmetic primitives. In the unified function world, LowerForAndIf is mostly a lowering pass with low complexity. As we move towards a dialect for affine operations (including `for` and `if`), it makes sense to lower `for` and `if` conditions directly to arithmetic primitives instead of relying on `affine_apply`. Expose `expandAffineExpr` function in LoweringUtils. Use this function together with `expandAffineMaps` to emit primitives that implement loop and branch conditions directly. Also remove tests that become unnecessary after transforming LowerForAndIf into a function pass. PiperOrigin-RevId: 227563608	2019-03-29 14:52:22 -07:00
Chris Lattner	bbf362b784	Eliminate extfunc/cfgfunc/mlfunc as a concept, and just use 'func' instead. The entire compiler now looks at structural properties of the function (e.g. does it have one block, does it contain an if/for stmt, etc) so the only thing holding up this difference is round tripping through the parser/printer syntax. Removing this shrinks the compiler by ~140 LOC. This is step 31/n towards merging instructions and statements. The last step is updating the docs, which I will do as a separate patch in order to split it from this mostly mechanical patch. PiperOrigin-RevId: 227540453	2019-03-29 14:51:37 -07:00
Alex Zinenko	0565067495	LLVM IR Lowering: support "select" This commit adds support for the "select" operation that lowers directly into its LLVM IR counterpart. A simple test is included. PiperOrigin-RevId: 227527893	2019-03-29 14:51:08 -07:00
Chris Lattner	50a356d118	Simplify FunctionPass to only have a runOnFunction hook, instead of having a runOnCFG/MLFunction override locations. Passes that care can handle this filtering if they choose. Also, eliminate one needless difference between CFG/ML functions in the parser. This is step 30/n towards merging instructions and statements. PiperOrigin-RevId: 227515912	2019-03-29 14:50:53 -07:00
Nicolas Vasilache	73f5c9c380	[MLIR] Sketch a simple set of EDSCs to declaratively write MLIR This CL introduces a simple set of Embedded Domain-Specific Components (EDSCs) in MLIR components: 1. a `Type` system of shell classes that closely matches the MLIR type system. These types are subdivided into `Bindable` leaf expressions and non-bindable `Expr` expressions; 2. an `MLIREmitter` class whose purpose is to: a. maintain a map of `Bindable` leaf expressions to concrete SSAValue; b. provide helper functionality to specify bindings of `Bindable` classes to SSAValue while verifying conformable types; c. traverse the `Expr` and emit the MLIR. This is used on a concrete example to implement MemRef load/store with clipping in the LowerVectorTransfer pass. More specifically, the following pseudo-C++ code: ```c++ MLFuncBuilder *b = ...; Location location = ...; Bindable zero, one, expr, size; // EDSL expression auto access = select(expr < zero, zero, select(expr < size, expr, size - one)); auto ssaValue = MLIREmitter(b) .bind(zero, ...) .bind(one, ...) .bind(expr, ...) .bind(size, ...) .emit(location, access); ``` is used to emit all the MLIR for a clipped MemRef access. This simple EDSL can easily be extended to more powerful patterns and should serve as the counterpart to pattern matchers (and could potentially be unified once we get enough experience). In the future, most of this code should be TableGen'd but for now it has concrete valuable uses: make MLIR programmable in a declarative fashion. This CL also adds Stmt, proper supporting free functions and rewrites VectorTransferLowering fully using EDSCs. 
The code for creating the EDSCs emitting a VectorTransferReadOp as loops with clipped loads is: ```c++ Stmt block = Block({ tmpAlloc = alloc(tmpMemRefType), vectorView = vector_type_cast(tmpAlloc, vectorMemRefType), ForNest(ivs, lbs, ubs, steps, { scalarValue = load(scalarMemRef, accessInfo.clippedScalarAccessExprs), store(scalarValue, tmpAlloc, accessInfo.tmpAccessExprs), }), vectorValue = load(vectorView, zero), tmpDealloc = dealloc(tmpAlloc.getLHS())}); emitter.emitStmt(block); ``` where `accessInfo.clippedScalarAccessExprs` is created with: ```c++ select(i + ii < zero, zero, select(i + ii < N, i + ii, N - one)); ``` The generated MLIR resembles: ```mlir %1 = dim %0, 0 : memref<?x?x?x?xf32> %2 = dim %0, 1 : memref<?x?x?x?xf32> %3 = dim %0, 2 : memref<?x?x?x?xf32> %4 = dim %0, 3 : memref<?x?x?x?xf32> %5 = alloc() : memref<5x4x3xf32> %6 = vector_type_cast %5 : memref<5x4x3xf32>, memref<1xvector<5x4x3xf32>> for %i4 = 0 to 3 { for %i5 = 0 to 4 { for %i6 = 0 to 5 { %7 = affine_apply #map0(%i0, %i4) %8 = cmpi "slt", %7, %c0 : index %9 = affine_apply #map0(%i0, %i4) %10 = cmpi "slt", %9, %1 : index %11 = affine_apply #map0(%i0, %i4) %12 = affine_apply #map1(%1, %c1) %13 = select %10, %11, %12 : index %14 = select %8, %c0, %13 : index %15 = affine_apply #map0(%i3, %i6) %16 = cmpi "slt", %15, %c0 : index %17 = affine_apply #map0(%i3, %i6) %18 = cmpi "slt", %17, %4 : index %19 = affine_apply #map0(%i3, %i6) %20 = affine_apply #map1(%4, %c1) %21 = select %18, %19, %20 : index %22 = select %16, %c0, %21 : index %23 = load %0[%14, %i1, %i2, %22] : memref<?x?x?x?xf32> store %23, %5[%i6, %i5, %i4] : memref<5x4x3xf32> } } } %24 = load %6[%c0] : memref<1xvector<5x4x3xf32>> dealloc %5 : memref<5x4x3xf32> ``` In particular notice that only 3 out of the 4-d accesses are clipped: this corresponds indeed to the number of dimensions in the super-vector. 
This CL also addresses the cleanups resulting from the review of the previous CL and performs some refactoring to simplify the abstraction. PiperOrigin-RevId: 227367414	2019-03-29 14:50:23 -07:00
Chris Lattner	ae618428f6	Greatly simplify the ConvertToCFG pass, converting it from a module pass to a function pass, and eliminating the need to copy over code and do interprocedural updates. While here, also improve it to make fewer empty blocks, and rename it to "LowerIfAndFor" since that is what it does. This is a net reduction of ~170 lines of code. As drive-bys, change the splitBlock method to not insert an unconditional branch, since that behavior is annoying for all clients. Also improve the AsmPrinter to not crash when a block is referenced that isn't linked into a function. PiperOrigin-RevId: 227308856	2019-03-29 14:48:13 -07:00
Uday Bondhugula	b9fe6be6d4	Introduce memref store to load forwarding - a simple memref dataflow analysis - the load/store forwarding relies on memref dependence routines as well as SSA/dominance to identify the memref store instance uniquely supplying a value to a memref load, and replaces the result of that load with the value being stored. The memref is also deleted when possible if only stores remain. - add methods for post dominance for MLFunction blocks. - remove duplicated getLoopDepth/getNestingDepth - move getNestingDepth, getMemRefAccess, getNumCommonSurroundingLoops into Analysis/Utils (were earlier static) - add a helper method in FlatAffineConstraints - isRangeOneToOne. PiperOrigin-RevId: 227252907	2019-03-29 14:47:28 -07:00
Uday Bondhugula	6e3462d251	Fix b/122139732; update FlatAffineConstraints::isEmpty() to eliminate IDs in a better order. - update isEmpty() to eliminate IDs in a better order. Speed improvement for complex cases (e.g., high-d reshape's involving mod's/div's). - minor efficiency update to projectOut (was earlier making an extra albeit benign call to gaussianEliminateIds) (NFC). - move getBestIdToEliminate further up in the file (NFC). - add the failing test case. - add debug info to checkMemRefAccessDependence. PiperOrigin-RevId: 227244634	2019-03-29 14:47:13 -07:00
Chris Lattner	8ef2552df7	Have the asmprinter take advantage of the new capabilities of the asmparser, by printing the entry block in a CFG function's argument line. Since I'm touching all of the testcases anyway, change the argument list from printing as "%arg : type" to "%arg: type" which is more consistent with bb arguments. In addition to being more consistent, this is a much nicer look for cfg functions. PiperOrigin-RevId: 227240069	2019-03-29 14:46:29 -07:00
Chris Lattner	aaa1d77e96	Clean up and improve the parser handling of basic block labels, now that we have a designator. This improves diagnostics and merges handling between CFG and ML functions more. This also eliminates hard coded parser knowledge of terminator keywords, allowing dialects to define their own terminators. PiperOrigin-RevId: 227239398	2019-03-29 14:46:13 -07:00
Chris Lattner	37579ae8c4	Introduce ^ as a basic block sigil, eliminating an ambiguity on the MLIR syntax. PiperOrigin-RevId: 227234174	2019-03-29 14:45:59 -07:00
Chris Lattner	56e2a6cc3b	Merge the verifier logic for all functions into a unified framework, this requires enhancing DominanceInfo to handle the structure of an ML function, which is required anyway. Along the way, this also fixes a const correctness problem with Instruction::getBlock(). This is step 24/n towards merging instructions and statements. PiperOrigin-RevId: 227228900	2019-03-29 14:45:43 -07:00
Chris Lattner	4a96a11d6d	Enhance parsing of CFG and Ext functions to optionally allow named arguments in the function signature, giving them common functionality to ml functions. This is a strictly additive patch that adds new capability without changing behavior in a significant way (other than a few diagnostic cleanups). A subsequent patch will change the printer to use this behavior, which will require updating a ton of testcases. :) This exposes the fact that we need to make a grammar change for block arguments, as is tracked by b/122119779 This is step 23/n towards merging instructions and statements, and one of the first steps towards eliminating the "cfg vs ml" distinction at a syntax and semantic level. PiperOrigin-RevId: 227228342	2019-03-29 14:45:28 -07:00
Chris Lattner	5b9c3f7cdb	Tidy up references to "basic blocks" that should refer to blocks now. NFC. PiperOrigin-RevId: 227196077	2019-03-29 14:44:59 -07:00
Chris Lattner	be9ee4a98e	Merge parser logic for CFG and ML functions, shrinking the code by ~80 lines. This causes a slight change to diagnostics, but is otherwise behavior preserving. This is step 22/n towards merging instructions and statements, MFC. PiperOrigin-RevId: 227187857	2019-03-29 14:44:44 -07:00
Chris Lattner	456ad6a8e0	Standardize naming of statements -> instructions, revisiting the code base to be consistent and moving the using declarations over. Hopefully this is the last truly massive patch in this refactoring. This is step 21/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227178245	2019-03-29 14:44:30 -07:00
Uday Bondhugula	b1d9cc4d1e	Extend/complete dependence tester to utilize local var info. - extend/complete dependence tester to utilize local var info while adding access function equality constraints; one more step closer to get slicing based fusion working in the general case of affine_apply's involving mod's/div's. - update test case to reflect more accurate dependence information; remove inaccurate comment on test case mod_deps. - fix a minor "bug" in equality addition in addMemRefAccessConstraints (doesn't affect correctness, but the fixed version is more intuitive). - some more surrounding code clean up - move simplifyAffineExpr out of anonymous AffineExprFlattener class - the latter has state, and the former should reside outside. PiperOrigin-RevId: 227175600	2019-03-29 14:44:14 -07:00
Chris Lattner	5187cfcf03	Merge Operation into OperationInst and standardize nomenclature around OperationInst. This is a big mechanical patch. This is step 16/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227093712	2019-03-29 14:42:23 -07:00
Uday Bondhugula	294687ef59	Fix affine expr flattener bug introduced by cl/225452174. - inconsistent local var constraint size when repeatedly using the same flattener for all expressions in a map. PiperOrigin-RevId: 227067836	2019-03-29 14:40:37 -07:00
Alex Zinenko	eb0f9f37af	SuperVectorization: fix 'isa' assertion Supervectorization uses null pointers to SSA values as a means of communicating the failure to vectorize. In operation vectorization, all operations producing the values of operation arguments must be vectorized for the given operation to be vectorized. The existing check verified if any of the value "def" statements was vectorized instead, sometimes leading to assertions inside `isa` called on a null pointer. Fix this to check that all "def" statements were vectorized. PiperOrigin-RevId: 226941552	2019-03-29 14:37:20 -07:00
Alex Zinenko	9403f80dd3	LLVM IR lowering: support SubIOp and SubFOp The binary subtraction operations were not supported by the lowering because they were not essential for the testing flow. Add support for these operations. PiperOrigin-RevId: 226941463	2019-03-29 14:37:05 -07:00
MLIR Team	4eef795a1d	Computation slice update: adds parameters to insertBackwardComputationSlice which specify the source loop nest depth at which to perform iteration space slicing, and the destination loop nest depth at which to insert the computation slice. Updates LoopFusion pass to take these parameters as command line flags for experimentation. PiperOrigin-RevId: 226514297	2019-03-29 14:35:03 -07:00
Jacques Pienaar	7e24010382	Expand rewriter gen to handle string attributes in output. * Extend to handle rewrite patterns with output attributes; - Constant attributes are defined with a value and a type; - The type of the value is mapped to the corresponding attribute type (string -> StringAttr); * Verifies the type of operands in the resultant matches the defined op's operands; PiperOrigin-RevId: 226468908	2019-03-29 14:34:31 -07:00
MLIR Team	bcb7c4742d	Do proper indexing for local variables when building access function equality constraints (working on test cases). PiperOrigin-RevId: 226399089	2019-03-29 14:34:02 -07:00
MLIR Team	2570fb5bb7	Address some issues from memref dependence check bug (b/121216762), adds test cases. PiperOrigin-RevId: 226277453	2019-03-29 14:33:17 -07:00
MLIR Team	6892ffb896	Improve loop fusion algorithm by using a memref dependence graph. Fixed TODO for reduction fusion unit test. PiperOrigin-RevId: 226277226	2019-03-29 14:33:02 -07:00
Uday Bondhugula	14d2618f63	Simplify memref-dependence-check's meta data structures / drop duplication and reuse existing ones. - drop IterationDomainContext, redundant since FlatAffineConstraints has MLValue information associated with its dimensions. - refactor to use existing support - leads to a reduction in LOC - as a result of these changes, non-constant loop bounds get naturally supported for dep analysis. - update test cases to include a couple with non-constant loop bounds - rename addBoundsFromForStmt -> addForStmtDomain - complete TODO for getLoopIVs (handle 'if' statements) PiperOrigin-RevId: 226082008	2019-03-29 14:32:46 -07:00
Uday Bondhugula	1d72f2e47e	Update / complete a TODO for addBoundsForForStmt - when adding constraints from a 'for' stmt into FlatAffineConstraints, correctly add bound operands of the 'for' stmt as a dimensional identifier or a symbolic identifier depending on whether the bound operand is a valid MLFunction symbol - update test case to exercise this. PiperOrigin-RevId: 225988511	2019-03-29 14:32:31 -07:00
Uday Bondhugula	20531932f4	Refactor/update memref-dep-check's addMemRefAccessConstraints and addDomainConstraints; add support for mod/div for dependence testing. - add support for mod/div expressions in dependence analysis - refactor addMemRefAccessConstraints to use getFlattenedAffineExprs (instead of getFlattenedAffineExpr); update addDomainConstraints. - rename AffineExprFlattener::cst -> localVarCst PiperOrigin-RevId: 225933306	2019-03-29 14:31:58 -07:00
Alex Zinenko	51c8a095a3	Materialize vector_type_cast operation in the SuperVector dialect This operation is produced and used by the super-vectorization passes and has been emitted as an abstract unregistered operation until now. For end-to-end testing purposes, it has to be eventually lowered to LLVM IR. Matching abstract operation by name goes into the opposite direction of the generic lowering approach that is expected to be used for LLVM IR lowering in the future. Register vector_type_cast operation as a part of the SuperVector dialect. Arguably, this operation is a special case of the `view` operation from the Standard dialect. The semantics of `view` is not fully specified at this point so it is safer to rely on a custom operation. Additionally, using a custom operation may help to achieve clear dialect separation. PiperOrigin-RevId: 225887305	2019-03-29 14:31:13 -07:00
MLIR Team	3b69230b3a	Loop Fusion pass update: introduce utilities to perform generalized loop fusion based on slicing; encompasses standard loop fusion. *) Adds simple greedy fusion algorithm to drive experimentation. This algorithm greedily fuses loop nests with single-writer/single-reader memref dependences to improve locality. *) Adds support for fusing slices of a loop nest computation: fusing one loop nest into another by adjusting the source loop nest's iteration bounds (after it is fused into the destination loop nest). This is accomplished by solving for the source loop nest's IVs in terms of the destination loop nest's IVs and symbols using the dependence polyhedron, then creating AffineMaps of these functions for the loop bounds of the fused source loop. *) Adds utility function 'insertMemRefComputationSlice' which computes and inserts computation slice from loop nest surrounding a source memref access into the loop nest surrounding the destination memref access. *) Adds FlatAffineConstraints::toAffineMap function which returns an AffineMap which represents an equality constraint where one dimension identifier is represented as a function of all others in the equality constraint. *) Adds multiple fusion unit tests. PiperOrigin-RevId: 225842944	2019-03-29 14:30:13 -07:00
Jacques Pienaar	49c4d2a630	Fix builder getFloatAttr of double to use F64 type and use fltSemantics in FloatAttr. Store FloatAttr using more appropriate fltSemantics (mostly fixing up F32/F64 storage, F16/BF16 pending). Previously F32 type was used incorrectly for double (the storage was double). Also add query method that returns fltSemantics for IEEE fp types and use that to verify that the APFloat given matches the type: * FloatAttr created using APFloat is verified that the semantics of the type and APFloat matches; * FloatAttr created using double has the APFloat created to match the semantics of the type; Change parsing of tensor negative splat element to pass in the element type expected. Misc other changes to account for the storage type matching the attribute. PiperOrigin-RevId: 225821834	2019-03-29 14:29:58 -07:00
Uday Bondhugula	c41ee60647	'memref-bound-check': extend to store op's as well - extend memref-bound-check to store op's - make the bound check an analysis util and move to lib/Analysis/Utils.cpp (so that one doesn't need to always create a pass to use it) PiperOrigin-RevId: 225564830	2019-03-29 14:29:13 -07:00
Uday Bondhugula	45a0f52519	Expression flattening improvement - reuse local expressions. - if a local id was already added for a specific mod/div expression, just reuse it if the expression repeats (instead of adding a new one). - drastically reduces the number of local variables added during flattening for real use cases - since the same div's and mod expressions often repeat. - add getFlattenedAffineExprs for AffineMap, IntegerSet based on the above As a natural result of the above: - FlatAffineConstraints(IntegerSet) ctor now deals with integer sets that have mod and div constraints as well, and these get simplified as well from -simplify-affine-structures PiperOrigin-RevId: 225452174	2019-03-29 14:28:13 -07:00
Uday Bondhugula	8365bdc17f	FlatAffineConstraints - complete TODOs: add method to remove duplicate / trivially redundant constraints. Update projectOut to eliminate identifiers in a more efficient order. Fix b/120801118. - add method to remove duplicate / trivially redundant constraints from FlatAffineConstraints (use a hashing-based approach with DenseSet) - update projectOut to eliminate identifiers in a more efficient order (A sequence of affine_apply's like this (from a real use case) finally exposed the lack of the above trivial/low hanging simplifications). for %ii = 0 to 64 { for %jj = 0 to 9 { %a0 = affine_apply (d0, d1) -> (d0 * (9 * 1024) + d1 * 128) (%ii, %jj) %a1 = affine_apply (d0) -> (d0 floordiv (2 * 3 * 3 * 128 * 128), (d0 mod 294912) floordiv (3 * 3 * 128 * 128), (((d0 mod 294912) mod 147456) floordiv 1152) floordiv 8, (((d0 mod 294912) mod 147456) mod 1152) floordiv 384, ((((d0 mod 294912) mod 147456) mod 1152) mod 384) floordiv 128, (((((d0 mod 294912) mod 147456) mod 1152) mod 384) mod 128) floordiv 128) (%a0) %v0 = load %in[%a1#0, %a1#1, %a1#3, %a1#4, %a1#2, %a1#5] : memref<2x2x3x3x16x1xi32> } } - update FlatAffineConstraints::print to print number of constraints. PiperOrigin-RevId: 225397480	2019-03-29 14:27:29 -07:00
Uday Bondhugula	4860f0e8fd	Fix loop unrolling test cases - These test cases had to be updated post the switch to exclusive upper bound; however, the test cases hadn't originally been written to check correctly; as a result, they didn't fail and weren't updated. Update test case and fix upper bound. PiperOrigin-RevId: 225194016	2019-03-29 14:26:56 -07:00
Alex Zinenko	359835eb27	LLVM IR lowering: support 1D vector operations Introduce initial support for 1D vector operations. LLVM does not support higher-dimensional vectors so the caller must make sure they don't appear in the input MLIR. Handle the presence of higher-dimensional vectors by failing gracefully. Introduce the type conversion for 1D vector types and hook it up with the rest of the type conversion system. Support "splat" constants for vector types. As a side effect, this refactors constant operation emission by separating out scalar integer constants into a separate case and by extracting out the helper function for scalar float construction. Existing binary operations apply to vectors transparently. PiperOrigin-RevId: 225172349	2019-03-29 14:26:37 -07:00
Alex Zinenko	97d2f3cd3d	ConvertToCFG: use affine_apply to implement loop steps Originally, loop steps were implemented using `addi` and `constant` operations because `affine_apply` was not handled in the first implementation. The support for `affine_apply` has been added, use it to implement the update of the loop induction variable. This is more consistent with the lower and upper bounds of the loop that are also implemented as `affine_apply`, removes the dependence of the converted function on the StandardOps dialect and makes it clear from the CFG function that all operations on the loop induction variable are purely affine. PiperOrigin-RevId: 225165337	2019-03-29 14:26:22 -07:00
Alex Zinenko	63261aa9a8	Disallow index types as elements of vector, memref and tensor types An extensive discussion demonstrated that it is difficult to support `index` types as elements of compound (vector, memref, tensor) types. In particular, their size is unknown until the target-specific lowering takes place. MLIR may need to store constants of the fixed-shape compound types (e.g., vector<4 x index>) internally and must know the size of the element type and data layout constraints. The same information is necessary for target-specific lowering and translation to reliably support compound types with `index` elements, but MLIR does not have a dedicated target description mechanism yet. The use cases for compound types with `index` elements, should they appear, can be handled via an `index_cast` operation that converts between `index` and fixed-size integer types at the SSA value level instead of the type level. PiperOrigin-RevId: 225064373	2019-03-29 14:25:22 -07:00
Uday Bondhugula	b9f53dc0bd	Update/Fix LoopUtils::stmtBodySkew to handle loop step. - loop step wasn't handled and there wasn't a TODO or an assertion; fix this. - rename 'delay' to shift for consistency/readability. - other readability changes. - remove duplicate attribute print for DmaStartOp; fix misplaced attribute print for DmaWaitOp - add build method for AddFOp (unrelated to this CL, but add it anyway) PiperOrigin-RevId: 224892958	2019-03-29 14:25:07 -07:00
Uday Bondhugula	d59a95a05c	Fix missing check for dependent DMAs in pipeline-data-transfer - adding a conservative check for now (TODO: use the dependence analysis pass once the latter is extended to deal with DMA ops). resolve an existing bug on a test case. - update test cases PiperOrigin-RevId: 224869526	2019-03-29 14:24:53 -07:00
Uday Bondhugula	6757fb151d	FlatAffineConstraints API cleanup; add normalizeConstraintsByGCD(). - add method normalizeConstraintsByGCD - call normalizeConstraintsByGCD() and GCDTightenInequalities() at the end of projectOut. - remove call to GCDTightenInequalities() from getMemRefRegion - change isEmpty() to check isEmptyByGCDTest() / hasInvalidConstraint() each time an identifier is eliminated (to detect emptiness early). - make FourierMotzkinEliminate, gaussianEliminateId(s), GCDTightenInequalities() private - improve / update stale comments PiperOrigin-RevId: 224866741	2019-03-29 14:24:37 -07:00
Uday Bondhugula	2ef57806ba	Update/fix -pipeline-data-transfer; fix b/120770946 - fix replaceAllMemRefUsesWith call to replace only inside loop body. - handle the case where DMA buffers are dynamic; extend doubleBuffer() method to handle dynamically shaped DMA buffers (pass the right operands to AllocOp) - place alloc's for DMA buffers at the depth at which pipelining is being done (instead of at top-level) - add more test cases PiperOrigin-RevId: 224852231	2019-03-29 14:24:22 -07:00
Uday Bondhugula	2d6478fa92	Extend loop tiling utility to handle non-constant loop bounds and bounds that are a max/min of several expressions. - Extend loop tiling to handle non-constant loop bounds and bounds that are a max/min of several expressions, i.e., bounds using multi-result affine maps - also fix b/120630124 as a result (the IR was in an invalid state when tiled loop generation failed; SSA uses were created that weren't plugged into the IR). PiperOrigin-RevId: 224604460	2019-03-29 14:23:34 -07:00
Uday Bondhugula	dfc752e42b	Generate strided DMAs from -dma-generate - generate DMAs correctly now using strided DMAs where needed - add support for multi-level/nested strides; op still supports one level of stride for now. Other things - add test case for symbolic lower/upper bound; cases where the DMA buffer size can't be bounded by a known constant - add test case for dynamic shapes where the DMA buffers are however bounded by constants - refactor some of the '-dma-generate' code PiperOrigin-RevId: 224584529	2019-03-29 14:23:19 -07:00
Nicolas Vasilache	d9b6420fc9	[MLIR] Add LowerVectorTransfersPass This CL adds a pass that lowers VectorTransferReadOp and VectorTransferWriteOp to a simple loop nest via local buffer allocations. This is an MLIR->MLIR lowering based on builders. A few TODOs are left to address in particular: 1. invert the permutation map so the accesses to the remote memref are coalesced; 2. pad the alloc for bank conflicts in local memory (e.g. GPUs shared_memory); 3. support broadcast / avoid copies when permutation_map is not of full column rank 4. add a proper "element_cast" op One notable limitation is this does not plan on supporting boundary conditions. It should be significantly easier to use pre-baked MLIR functions to handle such paddings. This is left for future consideration. Therefore the current CL only works properly for full-tile cases atm. This CL also adds 2 simple tests: ```mlir for %i0 = 0 to %M step 3 { for %i1 = 0 to %N step 4 { for %i2 = 0 to %O { for %i3 = 0 to %P step 5 { vector_transfer_write %f1, %A, %i0, %i1, %i2, %i3 {permutation_map: (d0, d1, d2, d3) -> (d3, d1, d0)} : vector<5x4x3xf32>, memref<?x?x?x?xf32, 0>, index, index, index, index ``` lowers into: ```mlir for %i0 = 0 to %arg0 step 3 { for %i1 = 0 to %arg1 step 4 { for %i2 = 0 to %arg2 { for %i3 = 0 to %arg3 step 5 { %1 = alloc() : memref<5x4x3xf32> %2 = "element_type_cast"(%1) : (memref<5x4x3xf32>) -> memref<1xvector<5x4x3xf32>> store %cst, %2[%c0] : memref<1xvector<5x4x3xf32>> for %i4 = 0 to 5 { %3 = affine_apply (d0, d1) -> (d0 + d1) (%i3, %i4) for %i5 = 0 to 4 { %4 = affine_apply (d0, d1) -> (d0 + d1) (%i1, %i5) for %i6 = 0 to 3 { %5 = affine_apply (d0, d1) -> (d0 + d1) (%i0, %i6) %6 = load %1[%i4, %i5, %i6] : memref<5x4x3xf32> store %6, %0[%5, %4, %i2, %3] : memref<?x?x?x?xf32> dealloc %1 : memref<5x4x3xf32> ``` and ```mlir for %i0 = 0 to %M step 3 { for %i1 = 0 to %N { for %i2 = 0 to %O { for %i3 = 0 to %P step 5 { %f = vector_transfer_read %A, %i0, %i1, %i2, %i3 {permutation_map: (d0, d1, 
d2, d3) -> (d3, 0, d0)} : (memref<?x?x?x?xf32, 0>, index, index, index, index) -> vector<5x4x3xf32> ``` lowers into: ```mlir for %i0 = 0 to %arg0 step 3 { for %i1 = 0 to %arg1 { for %i2 = 0 to %arg2 { for %i3 = 0 to %arg3 step 5 { %1 = alloc() : memref<5x4x3xf32> %2 = "element_type_cast"(%1) : (memref<5x4x3xf32>) -> memref<1xvector<5x4x3xf32>> for %i4 = 0 to 5 { %3 = affine_apply (d0, d1) -> (d0 + d1) (%i3, %i4) for %i5 = 0 to 4 { for %i6 = 0 to 3 { %4 = affine_apply (d0, d1) -> (d0 + d1) (%i0, %i6) %5 = load %0[%4, %i1, %i2, %3] : memref<?x?x?x?xf32> store %5, %1[%i4, %i5, %i6] : memref<5x4x3xf32> %6 = load %2[%c0] : memref<1xvector<5x4x3xf32>> dealloc %1 : memref<5x4x3xf32> ``` PiperOrigin-RevId: 224552717	2019-03-29 14:23:05 -07:00
Nicolas Vasilache	4adc169bd0	[MLIR] Add AffineMap composition and use it in Materialization This CL adds the following free functions: ``` /// Returns the AffineExpr e o m. AffineExpr compose(AffineExpr e, AffineMap m); /// Returns the AffineMap f o g. AffineMap compose(AffineMap f, AffineMap g); ``` This addresses the issue that AffineMap composition is only available at a distance via AffineValueMap and is thus unusable on Attributes. This CL thus implements AffineMap composition in a more modular and composable way. This CL does not claim that it can be a good replacement for the implementation in AffineValueMap, in particular it does not support bounded maps atm. Standalone tests are added that replicate some of the logic of the AffineMap composition pass. Lastly, affine map composition is used properly inside MaterializeVectors and a standalone test is added that requires permutation_map composition with a projection map. PiperOrigin-RevId: 224376870	2019-03-29 14:20:22 -07:00
Nicolas Vasilache	df0a25efee	[MLIR] Add support for permutation_map This CL hooks up and uses permutation_map in vector_transfer ops. In particular, when going into the nuts and bolts of the implementation, it became clear that cases arose that required supporting broadcast semantics. Broadcast semantics are thus added to the general permutation_map. The verify methods and tests are updated accordingly. Examples of interest include. Example 1: The following MLIR snippet: ```mlir for %i3 = 0 to %M { for %i4 = 0 to %N { for %i5 = 0 to %P { %a5 = load %A[%i4, %i5, %i3] : memref<?x?x?xf32> }}} ``` may vectorize with {permutation_map: (d0, d1, d2) -> (d2, d1)} into: ```mlir for %i3 = 0 to %0 step 32 { for %i4 = 0 to %1 { for %i5 = 0 to %2 step 256 { %4 = vector_transfer_read %arg0, %i4, %i5, %i3 {permutation_map: (d0, d1, d2) -> (d2, d1)} : (memref<?x?x?xf32>, index, index) -> vector<32x256xf32> }}} ``` Meaning that vector_transfer_read will be responsible for reading the 2-D slice: `%arg0[%i4, %i5:%i5+256, %i3:%i3+32]` into vector<32x256xf32>. This will require a transposition when vector_transfer_read is further lowered. Example 2: The following MLIR snippet: ```mlir %cst0 = constant 0 : index for %i0 = 0 to %M { %a0 = load %A[%cst0, %cst0] : memref<?x?xf32> } ``` may vectorize with {permutation_map: (d0) -> (0)} into: ```mlir for %i0 = 0 to %0 step 128 { %3 = vector_transfer_read %arg0, %c0_0, %c0_0 {permutation_map: (d0, d1) -> (0)} : (memref<?x?xf32>, index, index) -> vector<128xf32> } ``` Meaning that vector_transfer_read will be responsible for reading the 0-D slice `%arg0[%c0, %c0]` into vector<128xf32>. This will require a 1-D vector broadcast when vector_transfer_read is further lowered. Additionally, some minor cleanups and refactorings are performed. One notable thing missing here is the composition with a projection map during materialization. 
This is because I could not find an AffineMap composition that operates on AffineMap directly: everything related to composition seems to require going through SSAValue and only operates on AffineMap at a distance via AffineValueMap. I have raised this concern a bunch of times already, the followup CL will actually do something about it. In the meantime, the projection is hacked at a minimum to pass verification and materialization tests are temporarily incorrect. PiperOrigin-RevId: 224376828	2019-03-29 14:20:07 -07:00
Alex Zinenko	7c89a225cf	ConvertToCFG: support min/max in loop bounds. The recently introduced `select` operation enables ConvertToCFG to support min(max) in loop bounds. Individual min(max) is implemented as `cmpi "lt"`(`cmpi "gt"`) followed by a `select` between the compared values. Multiple results of an `affine_apply` operation extracted from the loop bounds are reduced using min(max) in a sequential manner. While this may decrease the potential for instruction-level parallelism, it is easier to recognize for the following passes, in particular for the vectorizer. PiperOrigin-RevId: 224376233	2019-03-29 14:19:52 -07:00
MLIR Team	a53ed1b767	Fix bug in GCD calculation when flattening AffineExpr (adds unit test which triggers the bug and tests the fix). PiperOrigin-RevId: 224246657	2019-03-29 14:19:07 -07:00
Uday Bondhugula	9f77faae87	Strided DMA support for DmaStartOp - add optional stride arguments for DmaStartOp - add DmaStartOp::verify(), and missing test cases for DMA op's in test/IR/memory-ops.mlir. PiperOrigin-RevId: 224232466	2019-03-29 14:18:37 -07:00
Uday Bondhugula	a92130880e	Complete multiple unhandled cases for DmaGeneration / getMemRefRegion; update/improve/clean up API. - update FlatAffineConstraints::getConstBoundDifference; return constant differences between symbolic affine expressions, look at equalities as well. - fix buffer size computation when generating DMAs symbolic in outer loops, correctly handle symbols at various places (affine access maps, loop bounds, loop IVs outer to the depth at which DMA generation is being done) - bug fixes / complete some TODOs for getMemRefRegion - refactor common code b/w memref dependence check and getMemRefRegion - FlatAffineConstraints API update; added methods employ trivial checks / detection - sufficient to handle hyper-rectangular cases in a precise way while being fast / low complexity. Hyper-rectangular cases fall out as trivial cases for these methods while other cases still do not cause failure (either return conservative or return failure that is handled by the caller). PiperOrigin-RevId: 224229879	2019-03-29 14:18:22 -07:00
MLIR Team	753109547d	During forward substitution, merge symbols from input AffineMap with the symbol list of the target AffineMap. Symbols can be used as dim identifiers and symbolic identifiers, and so we must preserve the symbolic identifiers from the input AffineMap during forward substitution, even if that same identifier is used as a dimension identifier in the target AffineMap. Test case added. Going forward, we may want to explore solutions where we do not maintain this split between dimensions and symbols, and instead verify the validity of each use of each AffineMap operand AffineMap in the context where the AffineMap operand usage is required to be a symbol: in the denominator of floordiv/ceildiv/mod for semi-affine maps, and in instructions that can capture symbols (i.e. alloc) PiperOrigin-RevId: 224017364	2019-03-29 14:16:40 -07:00
Alex Zinenko	7868abd9d8	ConvertToCFG: convert "if" statements. The condition of the "if" statement is an integer set, defined as a conjunction of affine constraints. An affine constraint consists of an affine expression and a flag indicating whether the expression is strictly equal to zero or is also allowed to be greater than zero. Affine maps, accepted by `affine_apply` are also formed from affine expressions. Leverage this fact to implement the checking of "if" conditions. Each affine expression from the integer set is converted into an affine map. This map is applied to the arguments of the "if" statement. The result of the application is compared with zero given the equality flag to obtain the final boolean value. The conjunction of conditions is tested sequentially with short-circuit branching to the "else" branch if any of the conditions evaluates to false. Create an SESE region for the if statement (including its "then" and optional "else" statement blocks) and append it to the end of the current region. The conditional region consists of a sequence of condition-checking blocks that implement the short-circuit scheme, followed by a "then" SESE region and an "else" SESE region, and the continuation block that post-dominates all blocks of the "if" statement. The flow of blocks that correspond to the "then" and "else" clauses are constructed recursively, enabling easy nesting of "if" statements and if-then-else-if chains. Note that MLIR semantics neither requires nor prohibits short-circuit evaluation. Since affine expressions do not have side effects, there is no observable difference in the program behavior. We may trade off extra operations for operation-level parallelism opportunity by first performing all `affine_apply` and comparison operations independently, and then performing a tree pattern reduction of the resulting boolean values with the `muli i1` operations (in absence of the dedicated bit operations). 
The pros and cons are not clear, and since MLIR does not include parallel semantics, we prefer to minimize the number of sequentially executed operations. PiperOrigin-RevId: 223970248	2019-03-29 14:16:10 -07:00
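The short-circuit condition check described in the ConvertToCFG "if" commit above can be modelled roughly as follows. This is only an illustrative Python sketch, not the pass itself: the `Constraint` pair, `integer_set_holds`, and `example_set` are hypothetical names, and each affine expression is represented as a plain Python callable rather than an affine map.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical model: an affine constraint is an expression over the "if"
# operands plus a flag saying whether it must equal zero (True) or may also
# be greater than zero (False).
Constraint = Tuple[Callable[..., int], bool]


def integer_set_holds(constraints: Sequence[Constraint],
                      operands: Sequence[int]) -> bool:
    """Evaluate a conjunction of affine constraints with short-circuiting."""
    for expr, is_equality in constraints:
        value = expr(*operands)  # plays the role of one affine_apply
        satisfied = (value == 0) if is_equality else (value >= 0)
        if not satisfied:
            # Corresponds to branching straight to the "else" block.
            return False
    # All condition-checking blocks passed: fall through to "then".
    return True


# Example set: d0 - 1 >= 0 and d1 - d0 == 0.
example_set = [
    (lambda d0, d1: d0 - 1, False),
    (lambda d0, d1: d1 - d0, True),
]
```

Because the expressions have no side effects, evaluating every constraint first and reducing the results afterwards would return the same boolean; the sketch simply mirrors the sequential scheme the commit message prefers.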
Alex Zinenko	dee51d0961	LLVM IR Lowering: support multi-value returns. Unlike MLIR, LLVM IR does not support functions that return multiple values. Simulate this by packing values into the LLVM structure type in the same order as they appear in the MLIR return. If the function returns only a single value, return it directly without packing. PiperOrigin-RevId: 223964886	2019-03-29 14:15:56 -07:00
Nicolas Vasilache	ebb3d38471	[MLIR] Separate and split vectorization tests These tests have become too bulky and unwieldy. Splitting simplifies modifications that will occur in the next CL. PiperOrigin-RevId: 223874321	2019-03-29 14:15:40 -07:00
Nicolas Vasilache	b39d1f0bdb	[MLIR] Add VectorTransferOps This CL implements and uses VectorTransferOps in lieu of the former custom call op. Tests are updated accordingly. VectorTransferOps come in 2 flavors: VectorTransferReadOp and VectorTransferWriteOp. VectorTransferOps can be thought of as a backend-independent pseudo op/library call that needs to be legalized to MLIR (whiteboxed) before it can be lowered to backend-dependent IR. Note that the current implementation does not yet support a real permutation map. Proper support will come in a followup CL. VectorTransferReadOp ==================== VectorTransferReadOp performs a blocking read from a scalar memref location into a super-vector of the same elemental type. This operation is called 'read' by opposition to 'load' because the super-vector granularity is generally not representable with a single hardware register. As a consequence, memory transfers will generally be required when lowering VectorTransferReadOp. A VectorTransferReadOp is thus a mid-level abstraction that supports super-vectorization with non-effecting padding for full-tile only code. A vector transfer read has semantics similar to a vector load, with additional support for: 1. an optional value of the elemental type of the MemRef. This value supports non-effecting padding and is inserted in places where the vector read exceeds the MemRef bounds. If the value is not specified, the access is statically guaranteed to be within bounds; 2. an attribute of type AffineMap to specify a slice of the original MemRef access and its transposition into the super-vector shape. The permutation_map is an unbounded AffineMap that must represent a permutation from the MemRef dim space projected onto the vector dim space. Example: ```mlir %A = alloc(%size1, %size2, %size3, %size4) : memref<?x?x?x?xf32> ... 
%val = `ssa-value` : f32 // let %i, %j, %k, %l be ssa-values of type index %v0 = vector_transfer_read %src, %i, %j, %k, %l {permutation_map: (d0, d1, d2, d3) -> (d3, d1, d2)} : (memref<?x?x?x?xf32>, index, index, index, index) -> vector<16x32x64xf32> %v1 = vector_transfer_read %src, %i, %j, %k, %l, %val {permutation_map: (d0, d1, d2, d3) -> (d3, d1, d2)} : (memref<?x?x?x?xf32>, index, index, index, index, f32) -> vector<16x32x64xf32> ``` VectorTransferWriteOp ===================== VectorTransferWriteOp performs a blocking write from a super-vector to a scalar memref of the same elemental type. This operation is called 'write' by opposition to 'store' because the super-vector granularity is generally not representable with a single hardware register. As a consequence, memory transfers will generally be required when lowering VectorTransferWriteOp. A VectorTransferWriteOp is thus a mid-level abstraction that supports super-vectorization with non-effecting padding for full-tile only code. A vector transfer write has semantics similar to a vector store, with additional support for handling out-of-bounds situations. Example: ```mlir %A = alloc(%size1, %size2, %size3, %size4) : memref<?x?x?x?xf32>. %val = `ssa-value` : vector<16x32x64xf32> // let %i, %j, %k, %l be ssa-values of type index vector_transfer_write %val, %src, %i, %j, %k, %l {permutation_map: (d0, d1, d2, d3) -> (d3, d1, d2)} : (vector<16x32x64xf32>, memref<?x?x?x?xf32>, index, index, index, index) ``` PiperOrigin-RevId: 223873234	2019-03-29 14:15:25 -07:00
Uday Bondhugula	89c41fdca1	FlatAffineConstraints::composeMap: return failure instead of asserting on semi-affine maps FlatAffineConstraints::composeMap: should return false instead of asserting on a semi-affine map. Make getMemRefRegion just propagate false when encountering semi-affine maps (instead of crashing!) PiperOrigin-RevId: 223828743	2019-03-29 14:14:56 -07:00
Uday Bondhugula	5f76245cfe	Minor fix for replaceAllMemRefUsesWith. The check for whether the memref was used in a non-dereferencing context had to be done inside, i.e., only for the op stmt's that the replacement was specified to be performed on (by the domStmtFilter arg if provided). As such, it is completely fine for example for a function to return a memref while the replacement is being performed only on a specific loop's body (as in the case of DMA generation). PiperOrigin-RevId: 223827753	2019-03-29 14:14:43 -07:00
River Riddle	7669a259c4	Add a simple common sub expression elimination pass. The algorithm collects defining operations within a scoped hash table. The scopes within the hash table correspond to nodes within the dominance tree for a function. This CL only adds support for simple operations, i.e. those without side effects. Side-effecting operations, e.g. load/store/call, will be handled in later patches. PiperOrigin-RevId: 223811328	2019-03-29 14:14:28 -07:00
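The scoped-hash-table idea behind the CSE commit above can be sketched in Python as below. This is a minimal model under stated assumptions, not the MLIR implementation: the `Op` tuple, the `dom_tree` and `blocks` dictionaries, and the `cse` function are all hypothetical, and every operation is assumed to be free of side effects.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical IR model: an op is (result_name, opcode, operand_names).
Op = Tuple[str, str, Tuple[str, ...]]
Key = Tuple[str, Tuple[str, ...]]


def cse(dom_tree: Dict[str, List[str]],
        blocks: Dict[str, List[Op]],
        root: str) -> Dict[str, str]:
    """Map each redundant result to the equivalent dominating result."""
    replacements: Dict[str, str] = {}
    scopes: List[Dict[Key, str]] = []

    def lookup(key: Key) -> Optional[str]:
        # Search the innermost scope first, then the dominating scopes.
        for scope in reversed(scopes):
            if key in scope:
                return scope[key]
        return None

    def visit(block: str) -> None:
        scopes.append({})  # one scope per dominance-tree node
        for result, opcode, operands in blocks[block]:
            # Canonicalize operands through replacements found so far.
            operands = tuple(replacements.get(o, o) for o in operands)
            key = (opcode, operands)
            existing = lookup(key)
            if existing is not None:
                replacements[result] = existing
            else:
                scopes[-1][key] = result
        for child in dom_tree.get(block, []):
            visit(child)
        scopes.pop()  # these definitions do not dominate sibling blocks

    visit(root)
    return replacements
```

Popping the scope when leaving a dominance-tree node is what keeps a definition in one branch from being reused in a sibling branch that it does not dominate.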
Jacques Pienaar	45e3139bc8	RankedTensorType: Use getHashValue(KeyTy) when calling getHashValue(RankedTensorTypeStorage*). PiperOrigin-RevId: 223649958	2019-03-29 14:13:44 -07:00
Nicolas Vasilache	1ae66f6520	[MLIR] Reenable materialize_vectors test Fixes one of the Filecheck'ed tests which was mistakenly disabled. PiperOrigin-RevId: 223401978	2019-03-29 14:12:40 -07:00
Lei Zhang	1f5330ac90	Verify CmpIOp's result type to be bool-like This CL added two new traits, SameOperandsAndResultShape and ResultsAreBoolLike, and changed CmpIOp to embody these two traits. As a consequence, CmpIOp's result type now is verified to be bool-like. PiperOrigin-RevId: 223208438	2019-03-29 14:11:53 -07:00
Alex Zinenko	a3fb6d0da3	StandardOps: introduce 'select'. The semantics of 'select' is conventional: return the second operand if the first operand is true (1 : i1) and the third operand otherwise. It is applicable to vectors and tensors element-wise, similarly to the LLVM instruction. This operation is necessary to implement min/max to lower 'for' loops with complex bounds to CFG functions and to support ternary operations in ML functions. It is preferred to first-class min/max because of its simplicity, e.g. it is not concerned with signedness. PiperOrigin-RevId: 223160860	2019-03-29 14:11:25 -07:00
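As a rough illustration of why 'select' is enough to build min/max, the scalar case can be modelled in Python as follows. The helper names `select`, `lowered_min`, and `lowered_max` are hypothetical; the point is only that any notion of signedness lives in the comparison that produces the condition, not in the select itself.

```python
def select(condition: bool, true_value, false_value):
    """Model of 'select': the second operand if the first is true, else the third."""
    return true_value if condition else false_value


def lowered_min(a: int, b: int) -> int:
    # The comparison decides the ordering; select only picks a value.
    return select(a < b, a, b)


def lowered_max(a: int, b: int) -> int:
    return select(a > b, a, b)
```

Both value operands are already computed before `select` is called, which mirrors an operation that chooses between two existing SSA values rather than branching.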
Alex Zinenko	e7f43c8361	LLVM IR lowering: support 'dim' operation. Add support for translating 'dim' operation on MemRefs to LLVM IR. For a static size, this operation merely defines an LLVM IR constant value that may not appear in the output IR if not used (and had not been removed before by DCE). For a dynamic size, this operation is translated into an access to the MemRef descriptor that contains the dynamic size. PiperOrigin-RevId: 223160774	2019-03-29 14:11:10 -07:00
Alex Zinenko	90d1b6b5f2	LLVM IR lowering: support simple MemRef types Introduce initial support for MemRef types, including type conversion, allocation and deallocation, read and write element-wise access, passing MemRefs to and returning from functions. Affine map compositions and non-default memory spaces are NOT YET supported. Lowered code needs to handle potentially dynamic sizes of the MemRef. To do so, it replaces a MemRef-typed value with a special MemRef descriptor that carries the data and the dynamic sizes together. A MemRef type is converted to LLVM's first-class structure type with the first element being the pointer to the data buffer with data laid out linearly, followed by as many integer-typed elements as MemRef has dynamic sizes. The type of these elements is that of MLIR index lowered to LLVM. For example, `memref<?x42x?xf32>` is converted to `{ f32*, i64, i64 }` provided `index` is lowered to `i64`. While it is possible to convert MemRefs with fully static sizes to simple pointers to their elemental types, we opted for consistency and convert them to the single-element structure. This makes the conversion code simpler and the calling convention of the generated LLVM IR functions consistent. Loads from and stores to a MemRef element are lowered to a sequence of LLVM instructions that, first, computes the linearized index of the element in the data buffer using the access indices and combining the static sizes with the dynamic sizes stored in the descriptor, and then loads from or stores to the buffer element indexed by the linearized subscript. While some of the index computations may be redundant (i.e., consecutive load and store to the same location in the same scope could reuse the linearized index), we emit them for every operation. A subsequent optimization pass may eliminate them if necessary. 
MemRef allocation and deallocation is performed using external functions `__mlir_alloc(index) -> i8*` and `__mlir_free(i8*)` that must be implemented by the caller. These functions behave similarly to `malloc` and `free`, but can be extended to support different memory spaces in future. Allocation and deallocation instructions take care of casting the pointers. Prior to calling the allocation function, the emitted code creates an SSA Value for the descriptor and uses it to store the dynamic sizes of the MemRef passed to the allocation operation. It further emits instructions that compute the dynamic amount of memory to allocate in bytes. Finally, the allocation stores the result of calling the `__mlir_alloc` in the MemRef descriptor. Deallocation extracts the pointer to the allocated memory from the descriptor and calls `__mlir_free` on it. The descriptor itself is not modified and, being stack-allocated, ceases to exist when it goes out of scope. MLIR functions that access MemRef values as arguments or return them are converted to LLVM IR functions that accept MemRef descriptors as LLVM IR structure types by value. This significantly simplifies the calling convention at the LLVM IR level and avoids handling descriptors in the dynamic memory, but is not always compatible with LLVM IR functions emitted from C code with similar signatures. A separate LLVM pass may be introduced in the future to provide C-compatible calling conventions for LLVM IR functions generated from MLIR. PiperOrigin-RevId: 223134883	2019-03-29 14:10:55 -07:00
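The descriptor and linearized-index scheme from the MemRef lowering commit above can be approximated with the Python sketch below. It is an illustration under assumptions, not the emitted LLVM IR: `MemRefDescriptor`, `full_shape`, `linearize`, and `load` are hypothetical names, a Python list stands in for the data buffer, `None` marks a dynamic (`?`) dimension, and a row-major layout is assumed since the message only says the data is laid out linearly.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class MemRefDescriptor:
    """Hypothetical descriptor: the data buffer plus the dynamic sizes only."""
    buffer: List[float]
    dynamic_sizes: List[int]


def full_shape(static_shape: Sequence[Optional[int]],
               desc: MemRefDescriptor) -> List[int]:
    """Combine static sizes from the type with dynamic sizes from the descriptor."""
    dynamic = iter(desc.dynamic_sizes)
    return [next(dynamic) if size is None else size for size in static_shape]


def linearize(indices: Sequence[int], shape: Sequence[int]) -> int:
    """Row-major linearized index (layout is an assumption of this sketch)."""
    linear = 0
    for index, size in zip(indices, shape):
        linear = linear * size + index
    return linear


def load(desc: MemRefDescriptor,
         static_shape: Sequence[Optional[int]],
         indices: Sequence[int]) -> float:
    """Recompute the linearized index on every access, as the commit describes."""
    return desc.buffer[linearize(indices, full_shape(static_shape, desc))]
```

For a `memref<?x42x?xf32>` the static shape would be written `[None, 42, None]` here, and the descriptor would carry two dynamic sizes, matching the two integer fields in the converted structure type.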
Alex Zinenko	68e9721aa8	Rename Deaffinator to LowerAffineApply and patch it. Several things were suggested in post-submission reviews. In particular, use pointers in function interfaces instead of references (still use references internally). Clarify the behavior of the pass in presence of MLFunctions. PiperOrigin-RevId: 222556851	2019-03-29 14:08:59 -07:00
Nicolas Vasilache	a5782f0d40	[MLIR][MaterializeVectors] Add a MaterializeVector pass via unrolling. This CL adds an MLIR-MLIR pass which materializes super-vectors to hardware-dependent sized vectors. While the physical vector size is target-dependent, the pass is written in a target-independent way: the target vector size is specified as a parameter to the pass. This pass is thus a partial lowering that opens the "greybox" that is the super-vector abstraction. This first CL adds a first materialization pass that iterates over vector_transfer_write operations and: 1. computes the program slice including the current vector_transfer_write; 2. computes the multi-dimensional ratio of super-vector shape to hardware vector shape; 3. for each possible multi-dimensional value within the bounds of ratio, a new slice is instantiated (i.e. cloned and rewritten) so that all operations in this instance operate on the hardware vector type. As a simple example, given: ```mlir mlfunc @vector_add_2d(%M : index, %N : index) -> memref<?x?xf32> { %A = alloc (%M, %N) : memref<?x?xf32> %B = alloc (%M, %N) : memref<?x?xf32> %C = alloc (%M, %N) : memref<?x?xf32> for %i0 = 0 to %M { for %i1 = 0 to %N { %a1 = load %A[%i0, %i1] : memref<?x?xf32> %b1 = load %B[%i0, %i1] : memref<?x?xf32> %s1 = addf %a1, %b1 : f32 store %s1, %C[%i0, %i1] : memref<?x?xf32> } } return %C : memref<?x?xf32> } ``` and the following options: ``` -vectorize -virtual-vector-size 32 --test-fastest-varying=0 -materialize-vectors -vector-size=8 ``` materialization emits: ```mlir #map0 = (d0, d1) -> (d0, d1) #map1 = (d0, d1) -> (d0, d1 + 8) #map2 = (d0, d1) -> (d0, d1 + 16) #map3 = (d0, d1) -> (d0, d1 + 24) mlfunc @vector_add_2d(%arg0 : index, %arg1 : index) -> memref<?x?xf32> { %0 = alloc(%arg0, %arg1) : memref<?x?xf32> %1 = alloc(%arg0, %arg1) : memref<?x?xf32> %2 = alloc(%arg0, %arg1) : memref<?x?xf32> for %i0 = 0 to %arg0 { for %i1 = 0 to %arg1 step 32 { %3 = affine_apply #map0(%i0, %i1) %4 = "vector_transfer_read"(%0, 
%3#0, %3#1) : (memref<?x?xf32>, index, index) -> vector<8xf32> %5 = affine_apply #map1(%i0, %i1) %6 = "vector_transfer_read"(%0, %5#0, %5#1) : (memref<?x?xf32>, index, index) -> vector<8xf32> %7 = affine_apply #map2(%i0, %i1) %8 = "vector_transfer_read"(%0, %7#0, %7#1) : (memref<?x?xf32>, index, index) -> vector<8xf32> %9 = affine_apply #map3(%i0, %i1) %10 = "vector_transfer_read"(%0, %9#0, %9#1) : (memref<?x?xf32>, index, index) -> vector<8xf32> %11 = affine_apply #map0(%i0, %i1) %12 = "vector_transfer_read"(%1, %11#0, %11#1) : (memref<?x?xf32>, index, index) -> vector<8xf32> %13 = affine_apply #map1(%i0, %i1) %14 = "vector_transfer_read"(%1, %13#0, %13#1) : (memref<?x?xf32>, index, index) -> vector<8xf32> %15 = affine_apply #map2(%i0, %i1) %16 = "vector_transfer_read"(%1, %15#0, %15#1) : (memref<?x?xf32>, index, index) -> vector<8xf32> %17 = affine_apply #map3(%i0, %i1) %18 = "vector_transfer_read"(%1, %17#0, %17#1) : (memref<?x?xf32>, index, index) -> vector<8xf32> %19 = addf %4, %12 : vector<8xf32> %20 = addf %6, %14 : vector<8xf32> %21 = addf %8, %16 : vector<8xf32> %22 = addf %10, %18 : vector<8xf32> %23 = affine_apply #map0(%i0, %i1) "vector_transfer_write"(%19, %2, %23#0, %23#1) : (vector<8xf32>, memref<?x?xf32>, index, index) -> () %24 = affine_apply #map1(%i0, %i1) "vector_transfer_write"(%20, %2, %24#0, %24#1) : (vector<8xf32>, memref<?x?xf32>, index, index) -> () %25 = affine_apply #map2(%i0, %i1) "vector_transfer_write"(%21, %2, %25#0, %25#1) : (vector<8xf32>, memref<?x?xf32>, index, index) -> () %26 = affine_apply #map3(%i0, %i1) "vector_transfer_write"(%22, %2, %26#0, %26#1) : (vector<8xf32>, 
memref<?x?xf32>, index, index) -> () } } return %2 : memref<?x?xf32> } ``` PiperOrigin-RevId: 222455351	2019-03-29 14:08:31 -07:00
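Steps 2 and 3 of the materialization commit above (computing the shape ratio, then instantiating one slice per point within that ratio) can be sketched in Python as follows. This is a simplified model, not the pass: `instance_offsets` is a hypothetical helper, and it assumes the super-vector and hardware vector shapes have the same rank.

```python
from itertools import product
from typing import List, Sequence, Tuple


def instance_offsets(super_shape: Sequence[int],
                     hw_shape: Sequence[int]) -> List[Tuple[int, ...]]:
    """Return the element offset of every hardware-vector instance.

    Assumes equal ranks and that each super-vector dimension is an exact
    multiple of the corresponding hardware vector dimension.
    """
    if len(super_shape) != len(hw_shape):
        raise ValueError("this sketch only handles shapes of equal rank")
    ratio = []
    for super_dim, hw_dim in zip(super_shape, hw_shape):
        if super_dim % hw_dim != 0:
            raise ValueError("super-vector is not a multiple of the hardware vector")
        ratio.append(super_dim // hw_dim)
    # One instance per multi-dimensional point within the bounds of the ratio.
    return [
        tuple(i * hw_dim for i, hw_dim in zip(point, hw_shape))
        for point in product(*(range(r) for r in ratio))
    ]
```

With a virtual vector size of 32 and a hardware vector size of 8, `instance_offsets([32], [8])` yields the offsets 0, 8, 16, and 24, which line up with the `d1`, `d1 + 8`, `d1 + 16`, and `d1 + 24` results of `#map0` through `#map3` in the emitted code.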
Nicolas Vasilache	5c16564bca	[MLIR][Slicing] Add utils for computing slices. This CL adds tooling for computing slices as an independent CL. The first consumer of this analysis will be super-vector materialization in a followup CL. In particular, this adds: 1. a getForwardStaticSlice function with documentation, example and a standalone unit test; 2. a getBackwardStaticSlice function with documentation, example and a standalone unit test; 3. a getStaticSlice function with documentation, example and a standalone unit test; 4. a topologicalSort function that is exercised through the getStaticSlice unit test. The getXXXStaticSlice functions take an additional root (resp. terminators) parameter which acts as a boundary that the transitive propagation algorithm is not allowed to cross. PiperOrigin-RevId: 222446208	2019-03-29 14:08:02 -07:00
Uday Bondhugula	2631b155a9	Fix bugs in DMA generation and FlatAffineConstraints; add more test cases. - fix bug in calculating index expressions for DMA buffers in certain cases (affected tiled loop nests); add more test cases for better coverage. - introduce an additional optional argument to replaceAllMemRefUsesWith; additional operands to the index remap AffineMap can now be supplied by the client. - FlatAffineConstraints::addBoundsForStmt - fix off by one upper bound, ::composeMap - fix position bug. - Some clean up and more comments PiperOrigin-RevId: 222434628	2019-03-29 14:07:31 -07:00
Alex Zinenko	615c41c788	Introduce Deaffinator pass. This function pass replaces affine_apply operations in CFG functions with sequences of primitive arithmetic instructions that form the affine map. The actual replacement functionality is located in LoweringUtils as a standalone function operating on an individual affine_apply operation and inserting the result at the location of the original operation. It is expected to be useful for other, target-specific lowering passes that may start at MLFunction level that Deaffinator does not support. PiperOrigin-RevId: 222406692	2019-03-29 14:07:16 -07:00
Alex Zinenko	ac6bfa6780	Lower scalar parts of CFG functions to LLVM IR Initial restricted implementation of the MLIR to LLVM IR translation. Introduce a new flow into the mlir-translate tool taking an MLIR module containing CFG functions only and producing an LLVM IR module. The MLIR features supported by the translator are as follows: - primitive and function types; - integer constants; - cfg and ext functions with 0 or 1 return values; - calls to these functions; - basic block conversion translation of arguments to phi nodes; - conversion between arguments of the first basic block and function arguments; - (conditional) branches; - integer addition and comparison operations. Are NOT supported: - vector and tensor types and operations on them; - memrefs and operations on them; - allocations; - functions returning multiple values; - LLVM Module triple and data layout (index type is hardcoded to i64). Create a new MLIR library and place it under lib/Target/LLVMIR. The "Target" library group is similar to the one present in LLVM and is intended to contain all future public MLIR translation targets. The general flow of MLIR to LLVM IR conversion will include several lowering and simplification passes on the MLIR itself in order to make the translation as simple as possible. In particular, ML functions should be transformed to CFG functions by the recently introduced pass, operations on structured types will be converted to sequences of operations on primitive types, complex operations such as affine_apply will be converted into sequences of primitive operations, primitive operations themselves may eventually be converted to an LLVM dialect that uses LLVM-like operations. Introduce the first translation test so that further changes make sure the basic translation functionality is not broken. PiperOrigin-RevId: 222400112	2019-03-29 14:07:01 -07:00
Alex Zinenko	6c5317eafa	Separate translators into "from MLIR" and "to MLIR". Translations performed by mlir-translate only have MLIR on one end. MLIR-to-MLIR conversions (including dialect changes) should be treated as passes and run by mlir-opt. Individual translations should not care about reading or writing MLIR and should work on in-memory representation of MLIR modules instead. Split the TranslateFunction interface and the translate registry into two parts: "from MLIR" and "to MLIR". Update mlir-translate to handle both registries together by wrapping translation functions into source-to-source conversions. Remove MLIR parsing and writing from individual translations and make them operate on Modules instead. This removes the need for individual translators to include tools/mlir-translate/mlir-translate.h, which can now be safely removed. Remove mlir-to-mlir translation that only existed as a registration example and use mlir-opt instead for tests. PiperOrigin-RevId: 222398707	2019-03-29 14:06:33 -07:00
River Riddle	1cfe508316	Add verifier check for integer constants to check that the value can fit within the type bit width. PiperOrigin-RevId: 222335526	2019-03-29 14:05:48 -07:00
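A bit-width check of the kind the verifier commit above describes might look like the Python sketch below. The message does not say how signedness is treated, so this sketch makes an explicit assumption: the integer type is signless, and a value is accepted if it fits either the signed or the unsigned range for that width. `fits_in_bit_width` is a hypothetical name.

```python
def fits_in_bit_width(value: int, width: int) -> bool:
    """Check that an integer constant is representable in `width` bits.

    Assumption of this sketch: the type is signless, so the value may fall
    anywhere from the signed minimum up to the unsigned maximum.
    """
    if width <= 0:
        raise ValueError("bit width must be positive")
    return -(1 << (width - 1)) <= value < (1 << width)
```

Under this assumption an 8-bit constant may range from -128 to 255; a stricter verifier could narrow the accepted range to one interpretation only.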
Uday Bondhugula	b6c03917ad	Remove allocations for memref's that become dead as a result of double buffering in the auto DMA overlap pass. This is done online in the pass. PiperOrigin-RevId: 222313640	2019-03-29 14:05:19 -07:00
Uday Bondhugula	0328217eb8	Automated rollback of changelist 221863955. PiperOrigin-RevId: 222299120	2019-03-29 14:04:05 -07:00
Nicolas Vasilache	87d46aaf4b	[MLIR][Vectorize] Refactor Vectorize use-def propagation. This CL refactors a few things in Vectorize.cpp: 1. a clear distinction is made between: a. the LoadOp are the roots of vectorization and must be vectorized eagerly and propagate their value; and b. the StoreOp which are the terminals of vectorization and must be vectorized late (i.e. they do not produce values that need to be propagated). 2. the StoreOp must be vectorized late because in general it can store a value that is not reachable from the subset of loads defined in the current pattern. One trivial such case is storing a constant defined at the top-level of the MLFunction and that needs to be turned into a splat. 3. a description of the algorithm is given; 4. the implementation matches the algorithm; 5. the last example is made parametric, in practice it will fully rely on the implementation of vector_transfer_read/write which will handle boundary conditions and padding. This will happen by lowering to a lower-level abstraction either: a. directly in MLIR (whether DMA or just loops or any async tasks in the future) (whiteboxing); b. in LLO/LLVM-IR/whatever blackbox library call/ search + swizzle inventor one may want to use; c. a partial mix of a. and b. (grey-boxing) 6. minor cleanups are applied; 7. mistakenly disabled unit tests are re-enabled (oopsie). 
With this CL, this MLIR snippet: ``` mlfunc @vector_add_2d(%M : index, %N : index) -> memref<?x?xf32> { %A = alloc (%M, %N) : memref<?x?xf32> %B = alloc (%M, %N) : memref<?x?xf32> %C = alloc (%M, %N) : memref<?x?xf32> %f1 = constant 1.0 : f32 %f2 = constant 2.0 : f32 for %i0 = 0 to %M { for %i1 = 0 to %N { // non-scoped %f1 store %f1, %A[%i0, %i1] : memref<?x?xf32> } } for %i4 = 0 to %M { for %i5 = 0 to %N { %a5 = load %A[%i4, %i5] : memref<?x?xf32> %b5 = load %B[%i4, %i5] : memref<?x?xf32> %s5 = addf %a5, %b5 : f32 // non-scoped %f1 %s6 = addf %s5, %f1 : f32 store %s6, %C[%i4, %i5] : memref<?x?xf32> } } return %C : memref<?x?xf32> } ``` vectorized with these arguments: ``` -vectorize -virtual-vector-size 256 --test-fastest-varying=0 ``` vectorization produces this standard innermost-loop vectorized code: ``` mlfunc @vector_add_2d(%arg0 : index, %arg1 : index) -> memref<?x?xf32> { %0 = alloc(%arg0, %arg1) : memref<?x?xf32> %1 = alloc(%arg0, %arg1) : memref<?x?xf32> %2 = alloc(%arg0, %arg1) : memref<?x?xf32> %cst = constant 1.000000e+00 : f32 %cst_0 = constant 2.000000e+00 : f32 for %i0 = 0 to %arg0 { for %i1 = 0 to %arg1 step 256 { %cst_1 = constant splat<vector<256xf32>, 1.000000e+00> : vector<256xf32> "vector_transfer_write"(%cst_1, %0, %i0, %i1) : (vector<256xf32>, memref<?x?xf32>, index, index) -> () } } for %i2 = 0 to %arg0 { for %i3 = 0 to %arg1 step 256 { %3 = "vector_transfer_read"(%0, %i2, %i3) : (memref<?x?xf32>, index, index) -> vector<256xf32> %4 = "vector_transfer_read"(%1, %i2, %i3) : (memref<?x?xf32>, index, index) -> vector<256xf32> %5 = addf %3, %4 : vector<256xf32> %cst_2 = constant splat<vector<256xf32>, 1.000000e+00> : vector<256xf32> %6 = addf %5, %cst_2 : vector<256xf32> "vector_transfer_write"(%6, %2, %i2, %i3) : (vector<256xf32>, memref<?x?xf32>, index, index) -> () } } return %2 : memref<?x?xf32> } ``` Of course, much more intricate n-D imperfectly-nested patterns can be emitted too in a fully declarative fashion, but this is enough for 
now. PiperOrigin-RevId: 222280209	2019-03-29 14:03:50 -07:00
Alex Zinenko	f986d5920b	ConvertToCFG: handle loop 1D affine loop bounds. In the general case, loop bounds can be expressed as affine maps of the outer loop iterators and function arguments. Relax the check for loop bounds to be known integer constants and also accept one-dimensional affine bounds in ConvertToCFG ForStmt lowering. Emit affine_apply operations for both the upper and the lower bound. The semantics of MLFunctions guarantees that both bounds can be computed before the loop starts iterating. Constant bounds are merely a short-hand notation for zero-dimensional affine maps and get supported transparently. Multidimensional affine bounds are not yet supported because the target IR dialect lacks min/max operations necessary to implement the corresponding semantics. PiperOrigin-RevId: 222275801	2019-03-29 14:03:20 -07:00
Jacques Pienaar	d0590caa90	Add op stats pass to mlir-opt. op-stats pass currently returns the number of occurrences of different operations in a Module. Useful for verifying transformation properties (e.g., 3 ops of specific dialect, 0 of another), but probably not useful outside of that so keeping it local to mlir-opt. This does not consider op attributes when counting. PiperOrigin-RevId: 222259727	2019-03-29 14:02:46 -07:00
Nicolas Vasilache	89d9913a20	[MLIR][VectorAnalysis] Add a VectorAnalysis and standalone tests This CL adds some vector support in anticipation of the upcoming vector materialization pass. In particular this CL adds 2 functions to: 1. compute the multiplicity of a subvector shape in a supervector shape; 2. help match operations on strict super-vectors. This is defined for a given subvector shape as an operation that manipulates a vector type that is an integral multiple of the subtype, with multiplicity at least 2. This CL also adds a TestUtil pass where we can dump arbitrary testing of functions and analysis that operate at a much smaller granularity than a pass (e.g. an analysis for which it is convenient to write a bit of artificial MLIR and write some custom test). This is in order to keep using Filecheck for things that essentially look and feel like C++ unit tests. PiperOrigin-RevId: 222250910	2019-03-29 14:02:17 -07:00
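The two helpers named in the VectorAnalysis commit above can be approximated in Python as shown below. This is a sketch under assumptions rather than the C++ analysis: `multiplicity` and `is_strict_super_vector` are hypothetical names, shapes are plain lists of positive integers, and both shapes are assumed to have the same rank.

```python
from typing import Optional, Sequence


def multiplicity(super_shape: Sequence[int],
                 sub_shape: Sequence[int]) -> Optional[int]:
    """Number of subvectors that tile the supervector, or None if none do.

    Assumes equal ranks; returns None when any dimension is not an
    integral multiple of the corresponding subvector dimension.
    """
    if len(super_shape) != len(sub_shape):
        return None
    total = 1
    for super_dim, sub_dim in zip(super_shape, sub_shape):
        if sub_dim <= 0 or super_dim % sub_dim != 0:
            return None
        total *= super_dim // sub_dim
    return total


def is_strict_super_vector(super_shape: Sequence[int],
                           sub_shape: Sequence[int]) -> bool:
    """True for an integral multiple of the subvector with multiplicity >= 2."""
    count = multiplicity(super_shape, sub_shape)
    return count is not None and count >= 2
```

A vector whose shape equals the subvector shape has multiplicity 1 and is therefore not a strict super-vector, which is the distinction the commit message draws.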
Jacques Pienaar	64c6d3946c	Change pretty printing of constant so that the attributes precede the value. This does create an inconsistency between the print formats (e.g., attributes are normally before operands) but fixes an invalid parsing & keeps constant uniform wrt itself (function or int attributes have type at same place). And specifying the specific type for an int/float attribute might get revised shortly. Also add test to verify that output printed can be parsed again. PiperOrigin-RevId: 221923893	2019-03-29 14:01:05 -07:00
Uday Bondhugula	fff1efbaf5	Updates to transformation/analysis passes/utilities. Update DMA generation pass and getMemRefRegion() to work with specified loop depths; add support for outgoing DMAs, store op's. - add support for getMemRefRegion symbolic in outer loops - hence support for DMAs symbolic in outer surrounding loops. - add DMA generation support for outgoing DMAs (store op's to lower memory space); extend getMemoryRegion to store op's. -memref-bound-check now works with store op's as well. - fix dma-generate (references to the old memref in the dma_start op were also being replaced with the new buffer); we need replace all memref uses to work only on a subset of the uses - add a new optional argument for replaceAllMemRefUsesWith. update replaceAllMemRefUsesWith to take an optional 'operation' argument to serve as a filter - if provided, only those uses that are dominated by the filter are replaced. - Add missing print for attributes for dma_start, dma_wait op's. - update the FlatAffineConstraints API PiperOrigin-RevId: 221889223	2019-03-29 14:00:51 -07:00
Uday Bondhugula	6b52ac3aa6	Mark AllocOp as being free of side effects PiperOrigin-RevId: 221863955	2019-03-29 14:00:37 -07:00
Jacques Pienaar	711047c0cd	Add Type to int/float attributes. * Optionally attach the type of integer and floating point attributes to the attributes, this allows restricting an int/float to a specific width. - Currently this allows suffixing int/float constant with type [this might be revised in future]. - Default to i64 and f32 if not specified. * For index types the APInt width used is 64. * Change callers to request a specific attribute type. * Store iN type with APInt of width N. * This change does not handle the folding of constants of different types (e.g., doing int type promotions to support constant folding i3 and i32), and instead restricts the constant folding to only operate on the same types. PiperOrigin-RevId: 221722699	2019-03-29 13:59:23 -07:00
River Riddle	503caf0722	Replace TerminatorInst with builtin terminator operations. Note: Terminators will be merged into the operations list in a follow up patch. PiperOrigin-RevId: 221670037	2019-03-29 13:58:55 -07:00
Alex Zinenko	d030433443	ConvertToCFG: properly remap nested function attributes. Array attributes can be nested and function attributes can appear anywhere at that level. They should be remapped to point to the generated CFGFunction after ML-to-CFG conversion, similarly to plain function attributes. Extract the nested attribute remapping functionality from the Parser to Utils. Extract out the remapping function for individual Functions from the module remapping function. Use these new functions in the ML-to-CFG conversion pass and in the parser. PiperOrigin-RevId: 221510997	2019-03-29 13:57:58 -07:00
Nicolas Vasilache	fefbf91314	[MLIR] Support for vectorizing operations. This CL adds support for and a vectorization test to perform scalar 2-D addf. The support extension notably comprises: 1. extend vectorizable test to exclude vector_transfer operations and expose them to LoopAnalysis where they are needed. This is a temporary solution until a concrete MLIR Op exists; 2. add some more functional sugar mapKeys, apply and ScopeGuard (which became relevant again); 3. fix improper shifting during coarsening; 4. rename unaligned load/store to vector_transfer_read/write and simplify the design removing the unnecessary AllocOp that were introduced prematurely: vector_transfer_read currently has the form: (memref<?x?x?xf32>, index, index, index) -> vector<32x64x256xf32> vector_transfer_write currently has the form: (vector<32x64x256xf32>, memref<?x?x?xf32>, index, index, index) -> () 5. add vectorizeOperations which traverses the operations in a ForStmt and rewrites them to their vector form; 6. add support for vector splat from a constant. The relevant tests are also updated. PiperOrigin-RevId: 221421426	2019-03-29 13:56:47 -07:00
Alex Zinenko	cab24dc211	Homogenize branch instruction arguments. Branch instruction arguments were defined and used inconsistently across different instructions, in both the spec and the implementation. In particular, conditional and unconditional branch instructions were using different syntax in the implementation. This led to the IR we produce not being accepted by the parser. Update the printer to use common syntax: `(` list-of-SSA-uses `:` list-of-types `)`. The motivation for choosing this syntax as opposed to the one in the spec, `(` list-of-SSA-uses `)` `:` list-of-types, is twofold. First, it is tricky to differentiate the label of the false branch from the type while parsing conditional branches (which is what apparently motivated the implementation to diverge from the spec in the first place). Second, the ongoing convergence between terminator instructions and other operations calls for consistency between their operand list syntax. After this change, the only remaining difference between the two is the use of parentheses. Update the comment of the parser that did not correspond to the code. Remove the unused isParenthesized argument from parseSSAUseAndTypeList. Update the spec accordingly. Note that the examples in the spec were _not_ using the EBNF defined a couple of lines above them, but were using the current syntax. Add a supplementary example of a branch to a basic block with multiple arguments. PiperOrigin-RevId: 221162655	2019-03-29 13:55:36 -07:00
Alex Zinenko	5a0d3d0204	Basic conversion of MLFunctions to CFGFunctions. Implement a pass converting a subset of MLFunctions to CFGFunctions. Currently supports arbitrarily complex imperfect loop nests with statically constant (i.e., not affine map) bounds filled with operations. Does NOT support branches and non-constant loop bounds. Conversion is performed per-function and the function names are preserved to avoid breaking any external references to the current module. In-memory IR is updated to point to the right functions in direct calls and constant loads. This behavior is tested via a really hidden flag that enables function renaming. Inside each function, the control flow conversion is based on single-entry single-exit regions, i.e. subgraphs of the CFG that have exactly one incoming and exactly one outgoing edge. Since an MLFunction must have a single "return" statement as per MLIR spec, it constitutes an SESE region. Individual operations are appended to this region. Control flow statements are recursively converted into such regions that are concatenated with the current region. Bodies of the compound statement also form SESE regions, which makes it easy to nest control flow statements. Note that SESE regions are not materialized in the code. It is sufficient to keep track of the end of the region as the current instruction insertion point as long as all recursive calls update the insertion point in the end. The converter maintains a mapping between SSA values in ML functions and their CFG counterparts. The mapping is used to find the operands for each operation and is updated to contain the results of each operation as the conversion continues. PiperOrigin-RevId: 221162602	2019-03-29 13:55:22 -07:00
Smit Hinsu	8946854128	Handle VectorOrTensorType parse failure instead of crashing This was unsafe after cr/219372163 and seems to be the only such case in the change. All other usage of dyn_cast are either handling the nullptr or are implicitly safe. For example, they are being extracted from operand or result SSAValue. TESTED with unit test PiperOrigin-RevId: 220905942	2019-03-29 13:54:10 -07:00
MLIR Team	b5424dd0cb	Adds support for returning the direction of the dependence between memref accesses (distance/direction vectors). Updates MemRefDependenceCheck to check and report on all memref access pairs at all loop nest depths. Updates old and adds new memref dependence check tests. Resolves multiple TODOs. PiperOrigin-RevId: 220816515	2019-03-29 13:53:28 -07:00
Uday Bondhugula	e0623d4b86	Automatic DMA generation for simple cases. - constant bounded memory regions, static shapes, no handling of overlapping/duplicate regions (through union) for now; also only, load memory op's. - add build methods for DmaStartOp, DmaWaitOp. - move getMemoryRegion() into Analysis/Utils and expose it. - fix addIndexSet, getMemoryRegion() post switch to exclusive upper bounds; update test cases for memref-bound-check and memref-dependence-check for exclusive bounds (missed in a previous CL) PiperOrigin-RevId: 220729810	2019-03-29 13:53:14 -07:00
Alex Zinenko	8e711246e4	Clean up VectorType construction. This CL introduces the following related changes: - factor out element type validity checking to a static member function VectorType::isValidElementType; - introduce get/getChecked similarly to MemRefType, where the checked function emits errors and returns nullptrs; - remove duplicate element type validity checking from the parser and rely on the type constructor to emit errors instead. PiperOrigin-RevId: 220693828	2019-03-29 13:52:46 -07:00
Uday Bondhugula	23ddd577ef	Complete migration to exclusive upper bound cl/220448963 had missed a part of the updates. - while on this, clean up some of the test cases to use ops' custom forms. PiperOrigin-RevId: 220675303	2019-03-29 13:52:17 -07:00
Alex Zinenko	846e48d16f	Allow vector types to have index elements. It is unclear why vector types were not allowed to have "index" as element type. Index values are integers, although of unknown bit width, and should behave as such. Vectors of integers are allowed and so are tensors of indices (for indirection purposes), it is more consistent to also have vectors of indices. PiperOrigin-RevId: 220630123	2019-03-29 13:51:33 -07:00
Alex Zinenko	ac2a655e87	Enable arithmetics for index types. Arithmetic and comparison instructions are necessary to implement, e.g., control flow when lowering MLFunctions to CFGFunctions. (While it is possible to replace some of the arithmetics by affine_apply instructions for loop bounds, it is still necessary for loop bounds checking, steps, if-conditions, non-trivial memref subscripts, etc.) Furthermore, working with indirect accesses in, e.g., lookup tables for large embeddings, may require operating on tensors of indexes. For example, the equivalents to C code "LUT[Index[i]]" or "ResultIndex[i] = i + j" where i, j are loop induction variables require the arithmetics on indices as well as the possibility to operate on tensors thereof. Allow arithmetic and comparison operations to apply to index types by declaring them integer-like. Allow tensors whose element type is index for indirection purposes. The absence of vectors with "index" element type is explicitly tested, but the only justification for this restriction in the CL introducing the test is "because we don't need them". Do NOT enable vectors of index types, although it makes vector and tensor types inconsistent with respect to allowed element types. PiperOrigin-RevId: 220614055	2019-03-29 13:51:19 -07:00
Alex Zinenko	3a38a5d0d6	Introduce integer comparison operation. This binary operation is applicable to integers, vectors and tensors thereof similarly to binary arithmetic operations. The operand types must match exactly, and the shape of the result type is the same as that of the operands. The element type of the result is always i1. The kind of the comparison is defined by the "predicate" integer attribute. This attribute requests one of: - equals to; - not equals to; - signed less than; - signed less than or equals; - signed greater than; - signed greater than or equals; - unsigned less than; - unsigned less than or equals; - unsigned greater than; - unsigned greater than or equals. Since integer values themselves do not have a sign, the comparison operator specifies whether to use signed or unsigned comparison logic, i.e. whether to interpret values where the foremost bit is set as negatives expressed as two's complements or as positive values. For non-scalar operands, pairwise per-element comparison is performed. Comparison operators on scalars are necessary to implement basic control flow with conditional branches. PiperOrigin-RevId: 220613566	2019-03-29 13:50:49 -07:00
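The signed versus unsigned distinction described in this commit can be illustrated with a small Python sketch. It is not the MLIR implementation: the predicate mnemonics (`eq`, `slt`, `ult`, ...), the `cmpi` helper and the explicit `width` parameter are assumptions made for illustration, and Python lists stand in for vectors and tensors.

```python
import operator


def to_unsigned(value: int, width: int) -> int:
    """Keep the low `width` bits and read them as a non-negative integer."""
    return value & ((1 << width) - 1)


def to_signed(value: int, width: int) -> int:
    """Read the low `width` bits as a two's complement integer."""
    raw = to_unsigned(value, width)
    # If the foremost bit is set, the value is negative in two's complement.
    return raw - (1 << width) if raw >> (width - 1) else raw


# Hypothetical mnemonics for the ten comparison kinds listed in the commit.
PREDICATES = {
    "eq": (to_unsigned, operator.eq),
    "ne": (to_unsigned, operator.ne),
    "slt": (to_signed, operator.lt),
    "sle": (to_signed, operator.le),
    "sgt": (to_signed, operator.gt),
    "sge": (to_signed, operator.ge),
    "ult": (to_unsigned, operator.lt),
    "ule": (to_unsigned, operator.le),
    "ugt": (to_unsigned, operator.gt),
    "uge": (to_unsigned, operator.ge),
}


def cmpi(predicate, lhs, rhs, width):
    """Compare two integers, or two equally long lists element by element."""
    interpret, compare = PREDICATES[predicate]
    if isinstance(lhs, list):
        if not isinstance(rhs, list) or len(lhs) != len(rhs):
            raise ValueError("operand shapes must match exactly")
        return [cmpi(predicate, a, b, width) for a, b in zip(lhs, rhs)]
    return compare(interpret(lhs, width), interpret(rhs, width))
```

With 8-bit operands, the bit pattern `0xFF` is less than `1` under the signed predicate (it reads as -1) but greater than `1` under the unsigned one (it reads as 255).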
Nicolas Vasilache	cde8248753	[MLIR] Make upper bound implementation exclusive This CL implements exclusive upper bound behavior as per b/116854378. A followup CL will update the semantics of the for loop. PiperOrigin-RevId: 220448963	2019-03-29 13:49:49 -07:00
Uday Bondhugula	6cd5d5c544	Introduce loop tiling code generation (hyper-rectangular case) - simple perfectly nested band tiling with fixed tile sizes. - only the hyper-rectangular case is handled, with other limitations of getIndexSet applying (constant loop bounds, etc.); once the latter utility is extended, tiled code generation should become more general. - Add FlatAffineConstraints::isHyperRectangular() PiperOrigin-RevId: 220324933	2019-03-29 13:49:05 -07:00
MLIR Team	239e328913	Adds MemRefDependenceCheck analysis pass, plus multiple dependence check tests. Adds equality constraints to dependence constraint system for accesses using dims/symbols where the defining operation of the dim/symbol is a constant. PiperOrigin-RevId: 219814740	2019-03-29 13:48:05 -07:00
Alex Zinenko	4aeb0a872c	Uniformize MemRefType well-formedness checks. Introduce a new public static member function, MemRefType::getChecked, intended for the users that want detailed error messages to be emitted during MemRefType construction and can gracefully handle these errors. This function takes a Location of the "MemRef" token if known. The parser is one user of getChecked that has location information, it outputs errors as compiler diagnostics. Other users may pass in an instance of UnknownLoc and still have error messages emitted. Compiler-internal users not expecting the MemRefType construction to fail should call MemRefType::get, which now aborts on failure with a generic message. Both "getChecked" and "get" call to a static free function that does actual construction with well-formedness checks, optionally emits errors and returns nullptr on failure. The location information passed to getChecked has voluntarily coarse precision. The error messages are intended for compiler engineers and do not justify heavier API than a single location. The text of the messages can be written so that it pinpoints the actual location of the error within a MemRef declaration. PiperOrigin-RevId: 219765902	2019-03-29 13:47:49 -07:00
Uday Bondhugula	74c62c8ce0	Complete memref bound checker for arbitrary affine expressions. Handle local variables from mod's and div's when converting to flat form. - propagate mod, floordiv, ceildiv / local variables constraint information when flattening affine expressions and converting them into flat affine constraints; resolve multiple TODOs. - enables memref bound checker to work with arbitrary affine expressions - update FlatAffineConstraints API with several new methods - test/exercise functionality mostly through -memref-bound-check - other analyses such as dependence tests, etc. should now be able to work in the presence of any affine composition of add, mul, floor, ceil, mod. PiperOrigin-RevId: 219711806	2019-03-29 13:47:29 -07:00
MLIR Team	f28e4df666	Adds a dependence check to test whether two accesses to the same memref access the same element. - Builds access functions and iterations domains for each access. - Builds dependence polyhedron constraint system which has equality constraints for equated access functions and inequality constraints for iteration domain loop bounds. - Runs elimination on the dependence polyhedron to test if no dependence exists between the accesses. - Adds a trivial LoopFusion transformation pass with a simple test policy to test dependence between accesses to the same memref in adjacent loops. - The LoopFusion pass will be extended in subsequent CLs. PiperOrigin-RevId: 219630898	2019-03-29 13:47:13 -07:00
Nicolas Vasilache	21638dcda9	[MLIR] Extend vectorization to 2+-D patterns This CL adds support for vectorization using more interesting 2-D and 3-D patterns. Note in particular the fact that we match some pretty complex imperfectly nested 2-D patterns with a quite minimal change to the implementation: we just add a bit of recursion to traverse the matched patterns and actually vectorize the loops. For instance, vectorizing the following loop by 128: ``` for %i3 = 0 to %0 { %7 = affine_apply (d0) -> (d0)(%i3) %8 = load %arg0[%c0_0, %7] : memref<?x?xf32> } ``` Currently generates: ``` #map0 = ()[s0] -> (s0 + 127) #map1 = (d0) -> (d0) for %i3 = 0 to #map0()[%0] step 128 { %9 = affine_apply #map1(%i3) %10 = alloc() : memref<1xvector<128xf32>> %11 = "n_d_unaligned_load"(%arg0, %c0_0, %9, %10, %c0) : (memref<?x?xf32>, index, index, memref<1xvector<128xf32>>, index) -> (memref<?x?xf32>, index, index, memref<1xvector<128xf32>>, index) %12 = load %10[%c0] : memref<1xvector<128xf32>> } ``` The above is subject to evolution. PiperOrigin-RevId: 219629745	2019-03-29 13:46:58 -07:00
Uday Bondhugula	8201e19e3d	Introduce memref bound checking. Introduce analysis to check memref accesses (in MLFunctions) for out of bound ones. It works as follows: $ mlir-opt -memref-bound-check test/Transforms/memref-bound-check.mlir /tmp/single.mlir:10:12: error: 'load' op memref out of upper bound access along dimension #1 %x = load %A[%idx#0, %idx#1] : memref<9 x 9 x i32> ^ /tmp/single.mlir:10:12: error: 'load' op memref out of lower bound access along dimension #1 %x = load %A[%idx#0, %idx#1] : memref<9 x 9 x i32> ^ /tmp/single.mlir:10:12: error: 'load' op memref out of upper bound access along dimension #2 %x = load %A[%idx#0, %idx#1] : memref<9 x 9 x i32> ^ /tmp/single.mlir:10:12: error: 'load' op memref out of lower bound access along dimension #2 %x = load %A[%idx#0, %idx#1] : memref<9 x 9 x i32> ^ /tmp/single.mlir:12:12: error: 'load' op memref out of upper bound access along dimension #1 %y = load %B[%idy] : memref<128 x i32> ^ /tmp/single.mlir:12:12: error: 'load' op memref out of lower bound access along dimension #1 %y = load %B[%idy] : memref<128 x i32> ^ #map0 = (d0, d1) -> (d0, d1) #map1 = (d0, d1) -> (d0 * 128 - d1) mlfunc @test() { %0 = alloc() : memref<9x9xi32> %1 = alloc() : memref<128xi32> for %i0 = -1 to 9 { for %i1 = -1 to 9 { %2 = affine_apply #map0(%i0, %i1) %3 = load %0[%2#0, %2#1] : memref<9x9xi32> %4 = affine_apply #map1(%i0, %i1) %5 = load %1[%4] : memref<128xi32> } } return } - Improves productivity while manually / semi-automatically developing MLIR for testing / prototyping; also provides an indirect way to catch errors in transformations. - This pass is an easy way to test the underlying affine analysis machinery including low level routines. 
Some code (in getMemoryRegion()) borrowed from @andydavis cl/218263256. While on this: - create mlir/Analysis/Passes.h; move Pass.h up from mlir/Transforms/ to mlir/ - fix a bug in AffineAnalysis.cpp::toAffineExpr TODO: extend to non-constant loop bounds (straightforward). Will transparently work for all accesses once floordiv, mod, ceildiv are supported in the AffineMap -> FlatAffineConstraints conversion. PiperOrigin-RevId: 219397961	2019-03-29 13:46:08 -07:00
Nicolas Vasilache	af7f56fdf8	[MLIR] Implement 1-D vectorization for fastest varying load/stores This CL is a first in a series that implements early vectorization of increasingly complex patterns. In particular, early vectorization will support arbitrary loop nesting patterns (both perfectly and imperfectly nested), at arbitrary depths in the loop tree. This first CL builds the minimal support for applying 1-D patterns. It relies on an unaligned load/store op abstraction that can be implemented differently on different HW. Future CLs will support higher dimensional patterns, but 1-D patterns already exhibit interesting properties. In particular, we want to separate pattern matching (i.e. legality both structural and dependency analysis based), from profitability analysis, from application of the transformation. As a consequence patterns may intersect and we need to verify that a pattern can still apply by the time we get to applying it. A non-greedy analysis on profitability that takes into account pattern intersection is left for future work. Additionally the CL makes the following cleanups: 1. the matches method now returns a value, not a reference; 2. added comments about the MLFunctionMatcher and MLFunctionMatches usage by value; 3. added size and empty methods to matches; 4. added a negative vectorization test with a conditional, this exhibited a bug in the iterators. Iterators now return nullptr if the underlying storage is nullptr. PiperOrigin-RevId: 219299489	2019-03-29 13:44:26 -07:00
Alex Zinenko	19f14b72bb	Drop unbounded identity map from MemRef affine map composition. Unbounded identity maps do not affect the accesses through MemRefs in any way. A previous CL dropped such maps only if they were alone in the composition. Go further and drop such maps everywhere they appear in the composition. Update the parser test to check for unique'd hoisted map to be present but without assuming any particular order. Because some of the hoisted identity maps still appear due to the nested "for" statements, we need to check for them. However, they no longer appear above the non-identity maps because they are no longer necessary for the extfunc memref declarations that are textually first in the test file. This order may change further as map simplification is improved, there is no reason to assume a particular order. PiperOrigin-RevId: 219287280	2019-03-29 13:44:13 -07:00
Lei Zhang	582b0761c6	Use matcher sugars for canonicalization pattern matching - Added a mechanism for specifying pattern matching more concisely like LLVM. - Added support for canonicalization of addi/muli over vector/tensor splat - Added ValueType to Attribute class hierarchy - Allowed creating constant splat PiperOrigin-RevId: 219149621	2019-03-29 13:43:44 -07:00
Uday Bondhugula	1ec77cecf2	FourierMotzkinEliminate trivial bug fix PiperOrigin-RevId: 219148982	2019-03-29 13:43:30 -07:00
Lei Zhang	60b5184c8b	Canonicalize muli(x, 1) into x PiperOrigin-RevId: 218885877	2019-03-29 13:42:01 -07:00
Alex Zinenko	aae372ecb8	Drop trivial identity affine mappings in MemRef construction. As per MLIR spec, the absence of affine maps in MemRef type is interpreted as an implicit identity affine map. Therefore, MemRef types declared with explicit or implicit identity map should be considered equal at the MemRefType level. During MemRefType construction, drop trivial identity affine map compositions. A trivial identity composition consists of a single unbounded identity map. It is unclear whether affine maps should be composed in-place to a single map during MemRef type construction, so non-trivial compositions that could have been simplified to an identity are NOT removed. We chose to drop the trivial identity map rather than inject it in places that assume it is present implicitly because it makes the code simpler by reducing boilerplate; identity mappings are obvious defaults. Update tests that were checking for the presence of trivial identity map compositions in the outputs. PiperOrigin-RevId: 218862454	2019-03-29 13:41:47 -07:00
Alex Zinenko	87c5145a5d	Perform the MemRef layout map dimensionality check in the Parser. This check was being performed in AllocOp::verify. However it is not specific to AllocOp and should apply to all MemRef type declarations. At the same time, the unique *Type factory functions in MLIRContext do not have access to location information necessary to properly emit diagnostics. Emit the error in Parser where the location information is available. Keep the error emission in AllocOp for the cases of programmatically-constructed, e.g. through Builders, IR with a note. Once we decided on the diagnostic infrastructure in type construction system, the type-related checks should be removed from specific Ops. Correct several parser test cases that have been using affine maps of mismatching dimensionality. This CL prepares for an upcoming change that will drop trivial identity affine map compositions during MemRefType construction. In that case, the dimensionality mismatch error must be emitted before dropping the identity map, i.e. during the type construction at the latest and before "verify" being called. PiperOrigin-RevId: 218844127	2019-03-29 13:41:33 -07:00
Uday Bondhugula	ea65c695b9	Introduce integer set attribute - add IntegerSetAttr to Attributes; add parsing and other support for it (builder, etc.). PiperOrigin-RevId: 218804579	2019-03-29 13:40:50 -07:00
Chris Lattner	967d934180	Fix two issues: 1) We incorrectly reassociated non-reassociative operations like subi, causing miscompilations. 2) When constant folding, we didn't add users of the new constant back to the worklist for reprocessing, causing us to miss some cases (pointed out by Uday). The code for #2 is gross, but I'll add the new APIs in a followup patch. PiperOrigin-RevId: 218803984	2019-03-29 13:40:35 -07:00
Uday Bondhugula	988ce3387f	Change sigil for integer set: @@ -> # PiperOrigin-RevId: 218786684	2019-03-29 13:40:21 -07:00
MLIR Team	13f6cc0187	Run GCD test before elimination. Adds test case with rational solutions, but no integer solutions. PiperOrigin-RevId: 218772332	2019-03-29 13:39:34 -07:00
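The GCD test mentioned here rests on a standard number-theory fact: a linear equality a_1*x_1 + ... + a_n*x_n = c with integer coefficients has an integer solution exactly when gcd(a_1, ..., a_n) divides c. The Python sketch below illustrates that fact only; the function name and the argument layout are hypothetical and do not mirror the FlatAffineConstraints API.

```python
from functools import reduce
from math import gcd


def equality_has_integer_solution(coeffs, constant):
    """Return True if sum(coeffs[i] * x[i]) == constant has an integer solution."""
    # math.gcd ignores signs and gcd(0, a) == abs(a), so 0 is a safe seed.
    divisor = reduce(gcd, coeffs, 0)
    if divisor == 0:
        # All coefficients are zero: the equality reduces to 0 == constant.
        return constant == 0
    return constant % divisor == 0
```

For example, 2x + 4y = 3 has rational solutions such as x = 3/2, y = 0, but the test rejects it because gcd(2, 4) = 2 does not divide 3.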
Uday Bondhugula	80610c2f49	Introduce Fourier-Motzkin variable elimination + other cleanup/support - Introduce Fourier-Motzkin variable elimination to eliminate a dimension from a system of linear equalities/inequalities. Update isEmpty to use this. Since FM is only exact on rational/real spaces, an emptiness check based on this is guaranteed to be exact whenever it says the underlying set is empty; if it says it's not empty, there may still be no integer points in it. Also, supports a version that computes "dark shadows". - Test this by checking for "always false" conditionals in if statements. - Unique IntegerSet's that are small (few constraints, few variables). This basically means the canonical empty set and other small sets that are likely commonly used get uniqued; allows checking for the canonical empty set by pointer. IntegerSet::kUniquingThreshold gives the threshold constraint size for uniquing. - rename simplify-affine-expr -> simplify-affine-structures Other cleanup - IntegerSet::numConstraints, AffineMap::numResults are no longer needed; remove them. - add copy assignment operators for AffineMap, IntegerSet. - rename Invalid() -> Null() on AffineExpr, AffineMap, IntegerSet - Misc cleanup for FlatAffineConstraints API PiperOrigin-RevId: 218690456	2019-03-29 13:38:24 -07:00
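As a rough illustration of the elimination step described in this commit, the Python sketch below eliminates one variable from a system of inequalities and uses that to test emptiness over the rationals. It is a textbook version under stated assumptions, not the code from the commit: each row `[a_0, ..., a_{n-1}, c]` is assumed to mean `a_0*x_0 + ... + a_{n-1}*x_{n-1} + c >= 0`, equalities and the "dark shadow" variant are left out, and both function names are hypothetical.

```python
def fm_eliminate(rows, pos):
    """Eliminate variable `pos` from rows of the form sum(a_i * x_i) + c >= 0."""
    lower, upper, rest = [], [], []
    for row in rows:
        if row[pos] > 0:
            lower.append(row)   # row gives a lower bound on x_pos
        elif row[pos] < 0:
            upper.append(row)   # row gives an upper bound on x_pos
        else:
            rest.append(row)    # row does not involve x_pos

    def drop(row):
        return row[:pos] + row[pos + 1:]

    result = [drop(row) for row in rest]
    for lo in lower:
        for up in upper:
            # Both multipliers are positive, so the combination is still a
            # valid ">= 0" constraint, and the x_pos coefficient cancels.
            combined = [(-up[pos]) * a + lo[pos] * b for a, b in zip(lo, up)]
            result.append(drop(combined))
    return result


def is_rationally_empty(rows, num_vars):
    """Return True if the system has no rational solution."""
    rows = [list(row) for row in rows]
    for _ in range(num_vars):
        rows = fm_eliminate(rows, 0)
    # Only constants remain; a row [c] with c < 0 states the false "c >= 0".
    return any(row[0] < 0 for row in rows)
```

The system `2x - 1 >= 0`, `-2x + 1 >= 0` is reported as non-empty because x = 1/2 satisfies it, even though it contains no integer point, which matches the caveat in the commit message.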
MLIR Team	5413239350	Adds Gaussian Elimination to FlatAffineConstraints. - Adds FlatAffineConstraints::isEmpty method to test if there are no solutions to the system. - Adds GCD test check if equality constraints have no solution. - Adds unit test cases. PiperOrigin-RevId: 218546319	2019-03-29 13:38:10 -07:00
Alex Zinenko	e8d254b909	Rename shape_cast to tensor_cast. "shape_cast" only applies to tensors, and there are other operations that actually affect shape, for example "reshape". Rename "shape_cast" to "tensor_cast" in both the code and the documentation. PiperOrigin-RevId: 218528122	2019-03-29 13:37:41 -07:00
Chris Lattner	bd01f9541f	Teach canonicalize pass to unique and hoist constants to the entry block. This is a straight-forward change, but required adding missing moveBefore() methods on operations (requiring moving some traits around to make C++ happy). This also fixes a constness issue with the getBlock/getFunction() methods on Instruction, and adds a missing getFunction() method on MLFuncBuilder. PiperOrigin-RevId: 218523905	2019-03-29 13:36:59 -07:00
Feng Liu	3d7ab2d265	Add support to opaque elements attributes For some of the constant vector / tensor, if the compiler doesn't need to interpret their elements content, they can be stored in this class to save the serialize / deserialize cost. syntax: `opaque<` tensor-type `,` opaque-string `>` opaque-string ::= `0x` [0-9a-fA-F]* PiperOrigin-RevId: 218399426	2019-03-29 13:36:45 -07:00
Chris Lattner	301f83f906	Implement shape folding in the canonicalization pass: - Add a few canonicalization patterns to fold memref_cast into load/store/dealloc. - Canonicalize alloc(constant) into an alloc with a constant shape followed by a cast. - Add a new PatternRewriter::updatedRootInPlace API to make this more convenient. SimplifyAllocConst and the testcase is heavily based on Uday's implementation work, just in a different framework. PiperOrigin-RevId: 218361237	2019-03-29 13:36:31 -07:00
Alex Zinenko	991adadccb	Move the ReturnOp type checks to ReturnOp::verify. This was left as a TODO in the code. Move the type verification from MLFuncVerifier::verifyReturn to ReturnOp::verify. Since the return operation can only appear as the last statement of an MLFunction, i.e. where the surrounding block is the function itself, it is easy to access the function descriptor (ReturnOp::verify already relies on this). From the function descriptor, one can easily access the type information. Note that this slightly modifies the error message due to the use of emitOpError instead of a plain emitError. Drop the obsolete TODO comment in MLFunction::verify about checking that "return" only appears as the last operation of an MLFunction since ReturnOp::verify explicitly checks for that. PiperOrigin-RevId: 218347843	2019-03-29 13:36:17 -07:00
Alex Zinenko	d58ffaffe0	Verify that the first block of a cfgfunc does not have predecessors. This was left as a TODO in the code. Note that the spec does not explicitly prohibit the first basic block from having a predecessor, and may be worth updating. The error is reported at the location of the cfgfunc to which the basic block belongs since the location information of the block label is not propagated beyond the IR parser. Arguably, pointing to a function that starts with an ill-formed block is better than pointing to the first operation in that block as it makes easier to follow the code down until the first block label. PiperOrigin-RevId: 218343654	2019-03-29 13:36:01 -07:00
Chris Lattner	a03051b9c4	Add a pattern (x+0) -> x, generalize Canonicalize to CFGFunc's, address a few TODOs, and add some casting support to Operation. PiperOrigin-RevId: 218219340	2019-03-29 13:35:33 -07:00
Chris Lattner	b2f93b27ee	introduce a memref_cast operation, refactoring common code between it and shape_cast into a common CastOp class. PiperOrigin-RevId: 218175818	2019-03-29 13:35:06 -07:00
Chris Lattner	7850258c49	Introduce a new Operation::erase helper to generalize some code in the pattern matcher / canonicalizer, and rename existing eraseFromBlock methods to align with it. PiperOrigin-RevId: 218104455	2019-03-29 13:34:51 -07:00
Uday Bondhugula	a55b2c2eb6	Fix AffineExpr printing bug: paren elision b/117887365. PiperOrigin-RevId: 217803621	2019-03-29 13:33:10 -07:00
Feng Liu	03b48999b6	Add support to constant sparse tensor / vector attribute The SparseElementsAttr uses (COO) Coordinate List encoding to represent a sparse tensor / vector. Specifically, the coordinates and values are stored as two dense elements attributes. The first dense elements attribute is a 2-D attribute with shape [N, ndims], which contains the indices of the elements with nonzero values in the constant vector/tensor. The second elements attribute is a 1-D attribute list with shape [N], which supplies the values for each element in the first elements attribute. ndims is the rank of the vector/tensor and N is the total nonzero elements. The syntax is: `sparse<` (tensor-type | vector-type)`, ` indices-attribute-list, values-attribute-list `>` Example: a sparse tensor sparse<vector<3x4xi32>, [[0, 0], [1, 2]], [1, 2]> represents the dense tensor [[1, 0, 0, 0] [0, 0, 2, 0] [0, 0, 0, 0]] PiperOrigin-RevId: 217764319	2019-03-29 13:32:55 -07:00
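The COO encoding in this commit can be expanded to its dense form with a few lines of Python. This is only a sketch of the encoding: nested lists stand in for the tensor, and `sparse_to_dense` with its `zero` parameter is a hypothetical helper, not part of SparseElementsAttr.

```python
def sparse_to_dense(shape, indices, values, zero=0):
    """Expand COO data (indices of shape [N, ndims], values of shape [N])."""
    if not shape:
        raise ValueError("this sketch expects a rank of at least 1")
    if len(indices) != len(values):
        raise ValueError("need exactly one value per index row")

    def build(dims):
        # Recursively allocate a nested list filled with the zero value.
        if not dims:
            return zero
        return [build(dims[1:]) for _ in range(dims[0])]

    dense = build(list(shape))
    for coord, value in zip(indices, values):
        if len(coord) != len(shape):
            raise ValueError("each index row must have ndims entries")
        cell = dense
        for axis in coord[:-1]:
            cell = cell[axis]       # walk down to the innermost list
        cell[coord[-1]] = value     # store the nonzero element
    return dense


# The example from the commit message: a 3x4 tensor with two nonzero entries.
dense = sparse_to_dense((3, 4), [[0, 0], [1, 2]], [1, 2])
```

Here `dense` holds the rows `[1, 0, 0, 0]`, `[0, 0, 2, 0]` and `[0, 0, 0, 0]`, the same dense tensor the commit message gives for `sparse<vector<3x4xi32>, [[0, 0], [1, 2]], [1, 2]>`.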
Feng Liu	b5b90e5465	Add support to constant dense vector/tensor attribute. The syntax of dense vector/tensor attribute value is `dense<` (tensor-type | vector-type)`,` attribute-list`>` and attribute-list ::= `[` attribute-list (`, ` attribute-list)* `]`. The construction of the dense vector/tensor attribute takes a vector/tensor type and a character array as arguments. The size of the input array should be larger than the size specified by the type argument. It also assumes the elements of the vector or tensor have been truncated to the data type sizes in the input character array, so it extends the truncated data to 64 bits when it is retrieved. PiperOrigin-RevId: 217762811	2019-03-29 13:32:41 -07:00
Uday Bondhugula	18e666702c	Generalize / improve DMA transfer overlap; nested and multiple DMA support; resolve multiple TODOs. - replace the fake test pass (that worked on just the first loop in the MLFunction) to perform DMA pipelining on all suitable loops. - nested DMAs work now (DMAs in an outer loop, more DMAs in nested inner loops) - fix bugs / assumptions: correctly copy memory space and elemental type of source memref for double buffering. - correctly identify matching start/finish statements, handle multiple DMAs per loop. - introduce dominates/properlyDominates utilities for MLFunction statements. - move checkDominancePreservationOnShifts to LoopAnalysis.h; rename it getShiftValidity - refactor getContainingStmtPos -> findAncestorStmtInBlock - move into Analysis/Utils.h; has two users. - other improvements / cleanup for related API/utilities - add size argument to dma_wait - for nested DMAs or in general, it makes it easy to obtain the size to use when lowering the dma_wait since we wouldn't want to identify the matching dma_start, and more importantly, in general/in the future, there may not always be a dma_start dominating the dma_wait. - add debug information in the pass PiperOrigin-RevId: 217734892	2019-03-29 13:32:28 -07:00
Nicolas Vasilache	3013dadb7c	[MLIR] Basic infrastructure for vectorization test This CL implements a very simple loop vectorization test and the basic infrastructure to support it. The test simply consists in: 1. matching the loops in the MLFunction and all the Load/Store operations nested under the loop; 2. testing whether all the Load/Store are contiguous along the innermost memory dimension along that particular loop. If any reference is non-contiguous (i.e. the ForStmt SSAValue appears in the expression), then the loop is not-vectorizable. The simple test above can gradually be extended with more interesting behaviors to account for the fact that a layout permutation may exist that enables contiguity etc. All these will come in due time but it is worthwhile noting that the test already supports detection of outer-vectorizable loops. In implementing this test, I also added a recursive MLFunctionMatcher and some sugar that can capture patterns such as `auto gemmLike = Doall(Doall(Red(LoadStore())))` and allows iterating on the matched IR structures. For now it just uses in order traversal but post-order DFS will be useful in the future once IR rewrites start occurring. One may note that the memory management design decision follows a different pattern from MLIR. After evaluating different designs and how they quickly increase cognitive overhead, I decided to opt for the simplest solution in my view: a class-wide (threadsafe) RAII context. This way, a pass that needs MLFunctionMatcher can just have its own locally scoped BumpPtrAllocator and everything is cleaned up when the pass is destroyed. If passes are expected to have a longer lifetime, then the contexts can easily be scoped inside the runOnMLFunction call and storage lifetime reduced. Lastly, whatever the scope of threading (module, function, pass), this is expected to also be future-proof wrt concurrency (but this is a detail atm). PiperOrigin-RevId: 217622889	2019-03-29 13:32:13 -07:00
Chris Lattner	80e884a9f8	Add constant folding and binary operator reassociation to the canonicalize pass, build up the worklist infra in anticipation of improving the pattern matcher to match more than one node. PiperOrigin-RevId: 217330579	2019-03-29 13:31:44 -07:00
MLIR Team	0114e232d8	Adds method to AffineApplyOp which forward substitutes its results into any of its users which are also AffineApplyOps. Updates ComposeAffineMaps test pass to use this method. Updates affine map composition test cases to handle the new pass, which can be reused when this method is used in a future instruction combine pass. PiperOrigin-RevId: 217163351	2019-03-29 13:30:49 -07:00
Jacques Pienaar	826f5c1c04	Avoid leak when parsing fails and BasicBlock has no use/function. Associate BasicBlocks with the function being parsed to avoid leaks in the case of parse failures. Associating with the function means that we can no longer determine if defined/fwd declared simply by considering if a BasicBlock has an associated function, so track forward declared block references explicitly (this should also allow flagging multiple undeclared fwd references). Split out getting the named block from defining it, in the case of definition move the block to the end of the function. Also destroy all forward reference placeholders in FunctionParser. Return parse failure in parseAttributeDict if there is no left brace instead of asserting. PiperOrigin-RevId: 217049507	2019-03-29 13:30:06 -07:00
Uday Bondhugula	86eac4618c	Create private exclusive / single use affine computation slice for an op stmt. - add util to create a private / exclusive / single use affine computation slice for an op stmt (see method doc comment); a single multi-result affine_apply op is prepended to the op stmt to provide all results needed for its operands as a function of loop iterators and symbols. - use it for DMA pipelining (to create private slices for DMA start stmt's); resolve TODOs/feature request (b/117159533) - move createComposedAffineApplyOp to Transforms/Utils; free it from taking a memref as input / generalize it. PiperOrigin-RevId: 216926818	2019-03-29 13:29:21 -07:00
Chris Lattner	9e3b928e32	Implement a super sketched out pattern match/rewrite framework and a sketched out canonicalization pass to drive it, and a simple (x-x) === 0 pattern match as a test case. There is a tremendous number of improvements that need to land, and the matcher/rewriter and patterns will be split out of this file, but this is a starting point. PiperOrigin-RevId: 216788604	2019-03-29 13:29:07 -07:00
Feng Liu	5e3cca906a	Add support to constant splat vector/tensor attribute. This attribute represents a reference to a splat vector or tensor, where all the elements have the same value. The syntax of the attribute is: `splat<` (tensor-type \| vector-type)`,` attribute-value `>` PiperOrigin-RevId: 216537997	2019-03-29 13:27:05 -07:00
Uday Bondhugula	82e55750d2	Add target independent standard DMA ops: dma.start, dma.wait Add target independent standard DMA ops: dma.start, dma.wait. Update pipeline data transfer to use these to detect DMA ops. While on this - return failure from mlir-opt::performActions if a pass generates invalid output - improve error message for verify 'n' operand traits PiperOrigin-RevId: 216429885	2019-03-29 13:26:10 -07:00
Jacques Pienaar	2df03be621	Fix some leak and crash found via fuzzing. Tried adding a fuzzer target (cl/216378253) and ran into a few problems, and fixing two of these. PiperOrigin-RevId: 216425403	2019-03-29 13:25:56 -07:00
MLIR Team	fe490043b0	Affine map composition. *) Implements AffineValueMap forward substitution for AffineApplyOps. *) Adds ComposeAffineMaps transformation pass, which composes affine maps for all loads/stores in an MLFunction. *) Adds multiple affine map composition tests. PiperOrigin-RevId: 216216446	2019-03-29 13:24:59 -07:00
Chris Lattner	d2d89cbc19	Rename affineint type to index type. The name 'index' may not be perfect, but is better than the old name. Here is some justification: 1) affineint (as it is named) is not a type suitable for general computation (e.g. the multiply/adds in an integer matmul). It has undefined width and is undefined on overflow. They are used as the indices for forstmt because they are intended to be used as indexes inside the loop. 2) It can be used in both cfg and ml functions, and in cfg functions. As you mention, “symbols” are not affine, and we use affineint values for symbols. 3) Integers aren’t affine, the algorithms applied to them can be. :) 4) The only suitable use for affineint in MLIR is for indexes and dimension sizes (i.e. the bounds of those indexes). PiperOrigin-RevId: 216057974	2019-03-29 13:24:16 -07:00
Uday Bondhugula	d18ae9e2c7	Constant folding for loop bounds. - Fold the lower/upper bound of a loop to a constant whenever the result of the application of the bound's affine map on the operand list yields a constant. - Update/complete 'for' stmt's API to set lower/upper bounds with operands. Resolve TODOs for ForStmt::set{Lower,Upper}Bound. - Moved AffineExprConstantFolder into AffineMap.cpp and added AffineMap::constantFold to be used by both AffineApplyOp and ForStmt::constantFoldBound. PiperOrigin-RevId: 215997346	2019-03-29 13:24:01 -07:00
Chris Lattner	6822c4e29c	Implement support for constant folding operations even when their operands are not all constant. Implement support for folding dim, x*0, and affine_apply. PiperOrigin-RevId: 215917432	2019-03-29 13:23:32 -07:00
Uday Bondhugula	6cfdb756b1	Introduce memref replacement/rewrite support: to replace an existing memref with a new one (of a potentially different rank/shape) with an optional index remapping. - introduce Utils::replaceAllMemRefUsesWith - use this for DMA double buffering (This CL also adds a few temporary utilities / code that will be done away with once: 1) abstract DMA op's are added 2) memref dereferencing side-effect / trait is available on op's 3) b/117159533 is resolved (memref index computation slices)). PiperOrigin-RevId: 215831373	2019-03-29 13:23:19 -07:00
Uday Bondhugula	0ebc927f2f	Fix MLIR's floordiv, ceildiv, and mod for constant inputs (for negative lhs's) - introduce mlir::{floorDiv, ceilDiv, mod} for constant inputs in mlir/Support/MathExtras.h - consistently use these everywhere in IR, Analysis, and Transforms. PiperOrigin-RevId: 215580677	2019-03-29 13:21:53 -07:00
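The rounding difference behind this fix can be sketched in a few lines. This is a minimal illustration in Python, not the code in mlir/Support/MathExtras.h: the names `trunc_div`, `floor_div`, `ceil_div`, and `mod` are hypothetical, and a positive divisor is assumed. It contrasts C/C++-style truncating division with the rounding that floordiv, ceildiv, and mod need when the left-hand side is negative.

```python
def trunc_div(lhs: int, rhs: int) -> int:
    """Integer division rounding toward zero, as C/C++ '/' does."""
    q = abs(lhs) // abs(rhs)
    return q if (lhs >= 0) == (rhs >= 0) else -q


def floor_div(lhs: int, rhs: int) -> int:
    """Division rounding toward negative infinity (assumes rhs >= 1)."""
    assert rhs >= 1
    q = trunc_div(lhs, rhs)
    # Truncation rounds an inexact negative quotient up, so step down by one.
    return q - 1 if lhs < 0 and lhs != q * rhs else q


def ceil_div(lhs: int, rhs: int) -> int:
    """Division rounding toward positive infinity (assumes rhs >= 1)."""
    assert rhs >= 1
    q = trunc_div(lhs, rhs)
    # Truncation rounds an inexact positive quotient down, so step up by one.
    return q + 1 if lhs > 0 and lhs != q * rhs else q


def mod(lhs: int, rhs: int) -> int:
    """Remainder consistent with floor_div; always in [0, rhs)."""
    return lhs - floor_div(lhs, rhs) * rhs


if __name__ == "__main__":
    for lhs in (-7, -6, 6, 7):
        print(lhs, trunc_div(lhs, 2), floor_div(lhs, 2), ceil_div(lhs, 2), mod(lhs, 2))
```

For a non-negative left-hand side, truncation and floor agree, which is why the discrepancy only shows up for negative inputs.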
Feng Liu	7d016fd352	Add support to Add, Sub, Mul for both Integer and Float types. The new operations are registered and also the const folding of them are implemented. PiperOrigin-RevId: 215575999	2019-03-29 13:21:40 -07:00
Uday Bondhugula	041817a45e	Introduce loop body skewing / loop pipelining / loop shifting utility. - loopBodySkew shifts statements of a loop body by stmt-wise delays, and is typically meant to be used to: - allow overlap of non-blocking start/wait until completion operations with other computation - allow shifting of statements (for better register reuse/locality/parallelism) - software pipelining (when applied to the innermost loop) - an additional argument specifies whether to unroll the prologue and epilogue. - add method to check SSA dominance preservation. - add a fake loop pipeline pass to test this utility. Sample input/output are below. While on this, fix/add following: - fix minor bug in getAddMulPureAffineExpr - add additional builder methods for common affine map cases - fix const_operand_iterator's for ForStmt, etc. When there is no such thing as 'const MLValue', the iterator shouldn't be returning const MLValue's. Returning MLValue is const correct. Sample input/output examples: 1) Simplest case: shift second statement by one. Input: for %i = 0 to 7 { %y = "foo"(%i) : (affineint) -> affineint %x = "bar"(%i) : (affineint) -> affineint } Output: #map0 = (d0) -> (d0 - 1) mlfunc @loop_nest_simple1() { %c8 = constant 8 : affineint %c0 = constant 0 : affineint %0 = "foo"(%c0) : (affineint) -> affineint for %i0 = 1 to 7 { %1 = "foo"(%i0) : (affineint) -> affineint %2 = affine_apply #map0(%i0) %3 = "bar"(%2) : (affineint) -> affineint } %4 = affine_apply #map0(%c8) %5 = "bar"(%4) : (affineint) -> affineint return } 2) DMA overlap: shift dma.wait and compute by one. 
Input for %i = 0 to 7 { %pingpong = affine_apply (d0) -> (d0 mod 2) (%i) "dma.enqueue"(%pingpong) : (affineint) -> affineint %pongping = affine_apply (d0) -> (d0 mod 2) (%i) "dma.wait"(%pongping) : (affineint) -> affineint "compute1"(%pongping) : (affineint) -> affineint } Output #map0 = (d0) -> (d0 mod 2) #map1 = (d0) -> (d0 - 1) #map2 = ()[s0] -> (s0 + 7) mlfunc @loop_nest_dma() { %c8 = constant 8 : affineint %c0 = constant 0 : affineint %0 = affine_apply #map0(%c0) %1 = "dma.enqueue"(%0) : (affineint) -> affineint for %i0 = 1 to 7 { %2 = affine_apply #map0(%i0) %3 = "dma.enqueue"(%2) : (affineint) -> affineint %4 = affine_apply #map1(%i0) %5 = affine_apply #map0(%4) %6 = "dma.wait"(%5) : (affineint) -> affineint %7 = "compute1"(%5) : (affineint) -> affineint } %8 = affine_apply #map1(%c8) %9 = affine_apply #map0(%8) %10 = "dma.wait"(%9) : (affineint) -> affineint %11 = "compute1"(%9) : (affineint) -> affineint return } 3) With arbitrary affine bound maps: Shift last two statements by two. 
Input: for %i = %N to ()[s0] -> (s0 + 7)()[%N] { %y = "foo"(%i) : (affineint) -> affineint %x = "bar"(%i) : (affineint) -> affineint %z = "foo_bar"(%i) : (affineint) -> (affineint) "bar_foo"(%i) : (affineint) -> (affineint) } Output #map0 = ()[s0] -> (s0 + 1) #map1 = ()[s0] -> (s0 + 2) #map2 = ()[s0] -> (s0 + 7) #map3 = (d0) -> (d0 - 2) #map4 = ()[s0] -> (s0 + 8) #map5 = ()[s0] -> (s0 + 9) for %i0 = %arg0 to #map0()[%arg0] { %0 = "foo"(%i0) : (affineint) -> affineint %1 = "bar"(%i0) : (affineint) -> affineint } for %i1 = #map1()[%arg0] to #map2()[%arg0] { %2 = "foo"(%i1) : (affineint) -> affineint %3 = "bar"(%i1) : (affineint) -> affineint %4 = affine_apply #map3(%i1) %5 = "foo_bar"(%4) : (affineint) -> affineint %6 = "bar_foo"(%4) : (affineint) -> affineint } for %i2 = #map4()[%arg0] to #map5()[%arg0] { %7 = affine_apply #map3(%i2) %8 = "foo_bar"(%7) : (affineint) -> affineint %9 = "bar_foo"(%7) : (affineint) -> affineint } 4) Shift one by zero, second by one, third by two for %i = 0 to 7 { %y = "foo"(%i) : (affineint) -> affineint %x = "bar"(%i) : (affineint) -> affineint %z = "foobar"(%i) : (affineint) -> affineint } #map0 = (d0) -> (d0 - 1) #map1 = (d0) -> (d0 - 2) #map2 = ()[s0] -> (s0 + 7) %c9 = constant 9 : affineint %c8 = constant 8 : affineint %c1 = constant 1 : affineint %c0 = constant 0 : affineint %0 = "foo"(%c0) : (affineint) -> affineint %1 = "foo"(%c1) : (affineint) -> affineint %2 = affine_apply #map0(%c1) %3 = "bar"(%2) : (affineint) -> affineint for %i0 = 2 to 7 { %4 = "foo"(%i0) : (affineint) -> affineint %5 = affine_apply #map0(%i0) %6 = "bar"(%5) : (affineint) -> affineint %7 = affine_apply #map1(%i0) %8 = "foobar"(%7) : (affineint) -> affineint } %9 = affine_apply #map0(%c8) %10 = "bar"(%9) : (affineint) -> affineint %11 = affine_apply #map1(%c8) %12 = "foobar"(%11) : (affineint) -> affineint %13 = affine_apply #map1(%c9) %14 = "foobar"(%13) : (affineint) -> affineint 5) SSA dominance violated; no shifting if a shift is specified for the 
second statement. for %i = 0 to 7 { %x = "foo"(%i) : (affineint) -> affineint "bar"(%x) : (affineint) -> affineint } PiperOrigin-RevId: 214975731	2019-03-29 13:21:26 -07:00
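The prologue, steady state, and epilogue in samples 1 and 4 above follow from a simple rule: a statement delayed by d runs original iteration t - d at step t. The sketch below simulates that rule in Python. It is an illustration only, not the loopBodySkew utility; the name `skew_schedule` is hypothetical, the loop range is assumed inclusive with unit step (as in the samples), and no SSA dominance check is performed.

```python
from typing import List, Tuple


def skew_schedule(lb: int, ub: int, stmts: List[str],
                  shifts: List[int]) -> List[List[Tuple[str, int]]]:
    """Return, per time step, the (statement, iteration) pairs that execute.

    The loop covers the inclusive range [lb, ub]. Statement k is delayed by
    shifts[k] steps, so at step t it executes original iteration t - shifts[k].
    """
    assert stmts and len(stmts) == len(shifts)
    assert all(d >= 0 for d in shifts)
    schedule = []
    for t in range(lb, ub + max(shifts) + 1):
        step = [(name, t - d) for name, d in zip(stmts, shifts)
                if lb <= t - d <= ub]
        schedule.append(step)
    return schedule


if __name__ == "__main__":
    # Sample 1: shift the second statement by one.
    for t, step in enumerate(skew_schedule(0, 7, ["foo", "bar"], [0, 1])):
        print(t, step)
```

Steps where only some statements are active correspond to the peeled prologue and epilogue; steps where every statement is active form the shifted loop body.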
Uday Bondhugula	ec35e51f6d	Change loop step to be a positive integral constant Changing this per discussion on mlir-team. Spec updated. PiperOrigin-RevId: 214868483	2019-03-29 13:21:13 -07:00
Uday Bondhugula	591fa9698e	Change behavior of loopUnrollFull with unroll factor 1 Using loopUnrollFull with unroll factor 1 should promote the loop body as opposed to doing nothing. PiperOrigin-RevId: 214812126	2019-03-29 13:20:59 -07:00
Chris Lattner	c706e0b1b5	Add support for expected-warning and expected-note markers in mlir-opt -verify mode. We even diagnose mistakes nicely (aside from the a/an vowel confusion which isn't worth worrying about): test/IR/invalid.mlir split at line #399:8:34: error: 'note' diagnostic emitted when expecting a 'error' %x = "bar"() : () -> i32 // expected-error {{operand defined here}} ^ PiperOrigin-RevId: 214773208	2019-03-29 13:20:46 -07:00
Chris Lattner	c6e4aa9ba7	Fix b/116749799, an issue where the ZeroResult trait's verifier hook left in an old form. Upgrade it, and move all the trait verifier implementations consistently out of line to reduce template bloat. PiperOrigin-RevId: 214718242	2019-03-29 13:20:18 -07:00
Nicolas Vasilache	140672a2b8	[MLIR] Add DimOp build support This CL introduces basic support to build a DimOp as well as a standalone test. PiperOrigin-RevId: 214688910	2019-03-29 13:20:03 -07:00
Nicolas Vasilache	54e5b4b4c0	[MLIR] Fix AsmPrinter for short-hand bound notation This CL restricts shorthand notation printing to only the bounds that can be roundtripped unambiguously; i.e.: 1. ()[]->(%some_cst) ()[] 2. ()[s0]->(s0) ()[%some_symbol] Upon inspection it turns out that the constant case was lossy so this CL also updates it. Note however that fixing this issue exhibits a potential issue in unroll.mlir. L488 exhibits a map ()[s0] -> (1)()[%arg0] which could be simplified down to ()[]->(1)()[]. This does not seem like a bug but maybe an undesired complexity in the maps generated by unrolling. bondhugula@, care to take a look? PiperOrigin-RevId: 214531410	2019-03-29 13:19:04 -07:00
Nicolas Vasilache	0f7fddfd65	[MLIR] Add support for MulFOp This CL adds support for `mulf` which is necessary to write/emit a simple scalar matmul in MLIR. This CL does not consider automation of generation of ops but mulf is important and useful enough to be added on its own atm. PiperOrigin-RevId: 214496098	2019-03-29 13:18:49 -07:00
MLIR Team	99188b9d98	Adds constant folding hook for AffineApplyOp. PiperOrigin-RevId: 214287780	2019-03-29 13:18:19 -07:00
Nicolas Vasilache	f9e50199e9	[MLIR] Fix AsmPrinter.cpp for single ssa-id AffineMap The AsmPrinter wrongly assumes that all single ssa-id AffineMap are the identity map for the purpose of printing. This CL adds the missing level of indirection as well as a test. This bug was originally shaken off by the experimental TC->MLIR path. Before this CL, the test would print: ``` mlfunc @mlfuncsimplemap(%arg0 : affineint, %arg1 : affineint, %arg2 : affineint) { for %i0 = 0 to %arg0 { for %i1 = 0 to %i0 { ~~~ should be %arg1 %c42_i32 = constant 42 : i32 } } return } ``` PiperOrigin-RevId: 214120817	2019-03-29 13:18:05 -07:00
Chris Lattner	d6f8ec7bac	Introduce [post]dominator tree and related infrastructure, use it in CFG func verifier. We get most of this infrastructure directly from LLVM, we just need to adapt it to our CFG abstraction. This has a few unrelated changes entangled in it: - getFunction() in various classes was const incorrect, fix it. - This moves Verifier.cpp to the analysis library, since Verifier depends on dominance and these are both really analyses. - IndexedAccessorIterator::reference was defined wrong, leading to really exciting template errors that were fun to diagnose. - This flips the boolean sense of the foldOperation() function in constant folding pass in response to previous patch feedback. PiperOrigin-RevId: 214046593	2019-03-29 13:17:20 -07:00
MLIR Team	aa0309d704	Add verification for AllocOp. PiperOrigin-RevId: 213829386	2019-03-29 13:16:47 -07:00
Chris Lattner	82eb284a53	Implement support for constant folding operations and a simple constant folding optimization pass: - Give the ability for operations to implement a constantFold hook (a simple one for single-result ops as well as general support for multi-result ops). - Implement folding support for constant and addf. - Implement support in AbstractOperation and Operation to make this usable by clients. - Implement a very simple constant folding pass that does top down folding on CFG and ML functions, with a testcase that exercises all the above stuff. Random cleanups: - Improve the build APIs for ConstantOp. - Stop passing "-o -" to mlir-opt in the testsuite, since that is the default. PiperOrigin-RevId: 213749809	2019-03-29 13:16:33 -07:00
Chris Lattner	14ca1be9a7	Add missing verifier logic for addf, and fix b/116054838 - Parser crash handling alloc with no affine mappings. PiperOrigin-RevId: 213639056	2019-03-29 13:15:48 -07:00
Feng Liu	7e004efae2	Add function attributes for ExtFunction, CFGFunction and MLFunction. PiperOrigin-RevId: 213540509	2019-03-29 13:15:35 -07:00
Jacques Pienaar	81a066e6e7	Switch from positional argument to explicit flags for mlir-translate This results in uniform behavior with mlir-opt. Exactly one transformation is allowed. PiperOrigin-RevId: 213493415	2019-03-29 13:15:22 -07:00
Uday Bondhugula	ab4797229c	Extend loop unroll/unroll-and-jam to affine bounds + refactor related code. - extend loop unroll-jam similar to loop unroll for affine bounds - extend both loop unroll/unroll-jam to deal with cleanup loop for non multiple of unroll factor. - extend promotion of single iteration loops to work with affine bounds - fix typo bugs in loop unroll - refactor common code b/w loop unroll and loop unroll-jam - move prototypes of non-pass transforms to LoopUtils.h - add additional builder methods. - introduce loopUnrollUpTo(factor) to unroll by either factor or trip count, whichever is less. - remove Statement::isInnermost (not used for now - will come back at the right place/in right form later) PiperOrigin-RevId: 213471227	2019-03-29 13:15:06 -07:00
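The cleanup-loop handling for trip counts that are not a multiple of the unroll factor can be sketched as a bounds computation. The Python below is an illustration under stated assumptions, not the code in LoopUtils.h: the name `unroll_bounds` is hypothetical, bounds are constant, the original step is 1, and the upper bound is treated as inclusive.

```python
from typing import Optional, Tuple


def unroll_bounds(lb: int, ub: int, factor: int
                  ) -> Tuple[Optional[Tuple[int, int, int]],
                             Optional[Tuple[int, int]]]:
    """Split the inclusive range [lb, ub] into a main and a cleanup loop.

    Returns (main, cleanup). main is (lb, ub, step) for the unrolled loop,
    or None if fewer than `factor` iterations exist. cleanup is (lb, ub)
    for the leftover iterations, or None if the trip count divides evenly.
    """
    assert factor >= 1
    trip = max(ub - lb + 1, 0)
    leftover = trip % factor
    main_iters = trip - leftover
    main = (lb, lb + main_iters - 1, factor) if main_iters > 0 else None
    cleanup = (lb + main_iters, ub) if leftover != 0 else None
    return main, cleanup


if __name__ == "__main__":
    print(unroll_bounds(0, 17, 4))  # 18 iterations, factor 4
    print(unroll_bounds(0, 15, 4))  # 16 iterations, factor 4
```

When the leftover is zero no cleanup loop is needed, which is the case later commits try to detect for non-constant affine bounds.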
Jacques Pienaar	47c7df0ed9	Tool for translating from/to MLIR. mlir-translate is a tool to translate from/to MLIR. The translations are registered at link time and intended for use in tests. An identity transformation (mlir-to-mlir) is registered by default as example and used in the parser test where simply parsing & printing required. The TranslateFunctions take filenames (instead of MemoryBuffer) to allow translations special write behavior (e.g., writing to uncommon filesystems). PiperOrigin-RevId: 213370448	2019-03-29 13:14:37 -07:00
Chris Lattner	e1257e8978	Change unranked tensor syntax from tensor<??f32> to tensor<*xf32> per discussion on the list. PiperOrigin-RevId: 212838226	2019-03-29 13:13:42 -07:00
Chris Lattner	a21f2f453d	Introduce pretty syntax for shape_cast as discussed on the list last week. PiperOrigin-RevId: 212823681	2019-03-29 13:13:29 -07:00
Uday Bondhugula	64812a56c7	Extend getConstantTripCount to deal with a larger subset of loop bounds; make loop unroll/unroll-and-jam more powerful; add additional affine expr builder methods - use previously added analysis/simplification to infer multiple of unroll factor trip counts, making loop unroll/unroll-and-jam more general. - for loop unroll, support bounds that are single result affine map's with the same set of operands. For unknown loop bounds, loop unroll will now work as long as trip count can be determined to be a multiple of unroll factor. - extend getConstantTripCount to deal with single result affine map's with the same operands. move it to mlir/Analysis/LoopAnalysis.cpp - add additional builder utility methods for affine expr arithmetic (difference, mod/floordiv/ceildiv w.r.t positive constant). simplify code to use the utility methods. - move affine analysis routines to AffineAnalysis.cpp/.h from AffineStructures.cpp/.h. - Rename LoopUnrollJam to LoopUnrollAndJam to match class name. - add an additional simplification for simplifyFloorDiv, simplifyCeilDiv - Rename AffineMap::getNumOperands() to getNumInputs: an affine map by itself does not have operands. Operands are passed to it through affine_apply, from loop bounds/if condition's, etc., operands are stored in the latter. This should be sufficiently powerful for now as far as unroll/unroll-and-jam go for TPU code generation, and can move to other analyses/transformations. Loop nests like these are now unrolled without any cleanup loop being generated. for %i = 1 to 100 { // unroll factor 4: no cleanup loop will be generated. 
for %j = (d0) -> (d0) (%i) to (d0) -> (d0 - d mod 4 - 1) (%i) { %y = "foo"(%j) : (affineint) -> i32 } } for %i = 1 to 100 { for %j = (d0) -> (d0) (%i) to (d0) -> (d0 + 128) (%i) { %x = "foo"() : () -> i32 } } TODO(bondhugula): extend this to LoopUnrollAndJam as well in the next CL (with minor changes). PiperOrigin-RevId: 212661212	2019-03-29 13:13:00 -07:00
Jacques Pienaar	8ad7e2b8fa	Update error message for invalid operand token while parsing operand list. Previously the error could mislead into thinking it was a parser bug instead of the input being erroneous. Update to make it clearer. PiperOrigin-RevId: 212271145	2019-03-29 13:12:31 -07:00
Jacques Pienaar	cf9aba2b2b	Check for absence of delimiters when delimiters is None and fixed number of operands expected. Ensure delimiters are absent where not expected. This is only checked in the case where operand count is known. This allows for the currently accepted case where there is an operand list with no delimiter and variable number of operands (which could be empty), followed by a delimited operand list. PiperOrigin-RevId: 212202064	2019-03-29 13:12:03 -07:00
Jacques Pienaar	d101fb937b	Return error status when number of operands don't match while parsing. Previously an error would be emitted but parsing would continue as false was being returned. PiperOrigin-RevId: 212192167	2019-03-29 13:11:49 -07:00
Uday Bondhugula	3bae041e5d	Add utility to promote single iteration loops. Add methods for getting constant loop counts. Improve / refactor loop unroll / loop unroll and jam. - add utility to remove single iteration loops. - use this utility to promote single iteration loops after unroll/unroll-and-jam - use loopUnrollByFactor for loopUnrollFull and remove most of the latter. - add methods for getting constant loop trip count PiperOrigin-RevId: 212039569	2019-03-29 13:11:21 -07:00
Chris Lattner	348f31a4fa	Add location specifier to MLIR Functions, and: - Compress the identifier/kind of a Function into a single word. - Eliminate otherFailure from verifier now that we always have a location - Eliminate the error string from the verifier now that we always have locations. - Simplify the parser's handling of fn forward references, using the location tracked by the function. PiperOrigin-RevId: 211985101	2019-03-29 13:10:55 -07:00
Chris Lattner	6337af082b	Improve location reporting in the verifier for return instructions and other terminators. Improve mlir-opt to print better location info in the split-files case. Before: error: unexpected error: branch has 2 operands, but target block has 1 br bb1(%0#1, %0#0 : i17, i1) ^ after: invalid.mlir split at line #305:6:3: error: unexpected error: branch has 2 operands, but target block has 1 br bb1(%0#1, %0#0 : i17, i1) ^ It still isn't optimal (it would be better to have just the original file and line number), but it is a step forward, and doing the optimal thing would be a lot more complicated. PiperOrigin-RevId: 211917067	2019-03-29 13:10:38 -07:00
Uday Bondhugula	d5416f299e	Complete AffineExprFlattener based simplification for floordiv/ceildiv. - handle floordiv/ceildiv in AffineExprFlattener; update the simplification to work even if mod/floordiv/ceildiv expressions appearing in the tree can't be eliminated. - refactor the flattening / analysis to move it out of lib/Transforms/ - fix MutableAffineMap::isMultipleOf - add AffineBinaryOpExpr:getAdd/getMul/... utility methods PiperOrigin-RevId: 211540536	2019-03-29 13:09:18 -07:00
Chris Lattner	6dc2a34dcf	Continue revising diagnostic handling to simplify and generalize it, and improve related infra. - Add a new -verify mode to the mlir-opt tool that allows writing test cases for optimization and other passes that produce diagnostics. - Refactor the existing -check-parser-errors flag to mlir-opt into a new -split-input-file option which is orthogonal to -verify. - Eliminate the special error hook the parser maintained and use the standard MLIRContext's one instead. - Enhance the default MLIRContext error reporter to print file/line/col of errors when it is available. - Add new createChecked() methods to the builder that create ops and invoke the verify hook on them, use this to detect unhandled code in the RaiseControlFlow pass. - Teach mlir-opt about expected-error @+, it previously only worked with @- PiperOrigin-RevId: 211305770	2019-03-29 13:08:51 -07:00
Tatiana Shpeisman	cedc28483f	Fix asan failure introduced by cl/210618122 and statement walker crash for if statements without else clause. PiperOrigin-RevId: 211186361	2019-03-29 13:08:25 -07:00
Uday Bondhugula	0122a99cbb	Affine expression analysis and simplification. Outside of IR/ - simplify a MutableAffineMap by flattening the affine expressions - add a simplify affine expression pass that uses this analysis - update the FlatAffineConstraints API (to be used in the next CL) In IR: - add isMultipleOf and getKnownGCD for AffineExpr, and make the in-IR simplification of simplifyMod simpler and more powerful. - rename the AffineExpr visitor methods to distinguish b/w visiting and walking, and to simplify API names based on context. The next CL will use some of these for the loop unrolling/unroll-jam to make the detection for the need of cleanup loop powerful/non-trivial. A future CL will finally move this simplification to FlatAffineConstraints to make it more powerful. For example, currently, even if a mod expr appearing in a part of the expression tree can't be simplified, the whole thing won't be simplified. PiperOrigin-RevId: 211012256	2019-03-29 13:07:44 -07:00
Uday Bondhugula	e9fb4b492d	Introduce loop unroll jam transformation. - for test purposes, the unroll-jam pass unroll jams the first outermost loop. While on this: - fix StmtVisitor to allow overriding of function to iterate walk over children of a stmt. PiperOrigin-RevId: 210644813	2019-03-29 13:07:30 -07:00
Tatiana Shpeisman	1a56ee7093	Implement operands for the 'if' statement. This CL also includes two other minor changes: - change the implemented syntax from 'if (cond)' to 'if cond', as specified by MLIR spec. - a minor fix to the implementation of the ForStmt. PiperOrigin-RevId: 210618122	2019-03-29 13:07:16 -07:00
Nicolas Vasilache	a124e9c4a5	Avoid hardcoded 4096 constant This commit creates a static constexpr limit for the IntegerType bitwidth and uses it. The check had to be moved because Token is not aware of IR/Type and it was a sign the abstraction leaked: bitwidth limit is not a property of the Token but of the IntegerType. Added a positive and a negative test at the limit. PiperOrigin-RevId: 210388192	2019-03-29 13:06:36 -07:00
Nicolas Vasilache	6d13c3b773	Add 2 extra MLIR affine tests This commit adds 2 tests: 1. a negative test in which the simplification of expression does not seem satisfactory. This test should be updated once expression simplification works reasonably. 2. a positive test in which floordiv and ceildiv return the same result, properly enforced with CHECK-NOT PiperOrigin-RevId: 210286267	2019-03-29 13:05:56 -07:00
Nicolas Vasilache	bd44fcb8ff	Fix confusing CHECK-EMPTY in affine-map test This commit replaces // CHECK-EMPTY because it is an extremely confusing way of allowing (but not checking for) empty lines. The problem is that // CHECK-EMPTY is only a comment and does not do anything. I originally tried to use // CHECK-EMPTY: but errors occurred due to missing newlines. The intended behavior of the test is to enforce nothing (not even a newline) is printed and the proper way to check for this is to use CHECK-NOT. Thanks to @rxwei for helping me figure out how to use CHECK-NOT properly. PiperOrigin-RevId: 210286262	2019-03-29 13:05:42 -07:00
Tatiana Shpeisman	d32a28c520	Implement operands for the lower and upper bounds of the for statement. This revamps implementation of the loop bounds in the ForStmt, using general representation that supports operands. The frequent case of constant bounds is supported via special access methods. This also includes: - Operand iterators for the Statement class. - OpPointer::is() method to query the class of the Operation. - Support for the bound shorthand notation parsing and printing. - Validity checks for the bound operands used as dim ids and symbols I didn't mean this CL to be so large. It just happened this way, as one thing led to another. PiperOrigin-RevId: 210204858	2019-03-29 13:05:16 -07:00
Chris Lattner	9de71b2aea	Introduce a new extract_element operation that does what it says. Introduce a new VectorOrTensorType class that provides a common interface between vector and tensor since a number of operations will be uniform across them (including extract_element). Improve the LoadOp verifier. I also updated the MLIR spec doc as well. PiperOrigin-RevId: 209953189	2019-03-29 13:04:19 -07:00
Chris Lattner	84259c7def	Implement call and call_indirect ops. This also fixes an infinite recursion in VariadicOperands that this turned up. PiperOrigin-RevId: 209692932	2019-03-29 13:03:51 -07:00
Uday Bondhugula	00bed4bd99	Extend loop unrolling to unroll by a given factor; add builder for affine apply op. - add builder for AffineApplyOp (first one for an operation that has non-zero operands) - add support for loop unrolling by a given factor; uses the affine apply op builder. While on this, change 'step' of ForStmt to be 'unsigned' instead of AffineConstantExpr *. Add setters for ForStmt lb, ub, step. Sample Input: // CHECK-LABEL: mlfunc @loop_nest_unroll_cleanup() { mlfunc @loop_nest_unroll_cleanup() { for %i = 1 to 100 { for %j = 0 to 17 { %x = "addi32"(%j, %j) : (affineint, affineint) -> i32 %y = "addi32"(%x, %x) : (i32, i32) -> i32 } } return } Output: $ mlir-opt -loop-unroll -unroll-factor=4 /tmp/single2.mlir #map0 = (d0) -> (d0 + 1) #map1 = (d0) -> (d0 + 2) #map2 = (d0) -> (d0 + 3) mlfunc @loop_nest_unroll_cleanup() { for %i0 = 1 to 100 { for %i1 = 0 to 17 step 4 { %0 = "addi32"(%i1, %i1) : (affineint, affineint) -> i32 %1 = "addi32"(%0, %0) : (i32, i32) -> i32 %2 = affine_apply #map0(%i1) %3 = "addi32"(%2, %2) : (affineint, affineint) -> i32 %4 = affine_apply #map1(%i1) %5 = "addi32"(%4, %4) : (affineint, affineint) -> i32 %6 = affine_apply #map2(%i1) %7 = "addi32"(%6, %6) : (affineint, affineint) -> i32 } for %i2 = 16 to 17 { %8 = "addi32"(%i2, %i2) : (affineint, affineint) -> i32 %9 = "addi32"(%8, %8) : (i32, i32) -> i32 } } return } PiperOrigin-RevId: 209676220	2019-03-29 13:03:38 -07:00
Uday Bondhugula	6911c24e97	Sketch out affine analysis structures: AffineValueMap, IntegerValueSet, FlatAffineConstraints, and MutableAffineMap. All four classes introduced reside in lib/Analysis and are not meant to be used in the IR (from lib/IR or lib/Parser/). They are all mutable, alloc'ed, dealloc'ed - although with their fields pointing to immutable affine expressions (AffineExpr *). While on this, update simplifyMod to fold mod to a zero when possible. PiperOrigin-RevId: 209618437	2019-03-29 13:03:24 -07:00
Chris Lattner	d9290db5fe	Finish support for function attributes, and improve lots of things: - Have the parser rewrite forward references to their resolved values at the end of parsing. - Implement verifier support for detecting malformed function attrs. - Add efficient query for (in general, recursive) attributes to tell if they contain a function. As part of this, improve other general infrastructure: - Implement support for verifying OperationStmt's in ml functions, refactoring and generalizing support for operations in the verifier. - Refactor location handling code in mlir-opt to have the non-error expecting form of mlir-opt invocations to report error locations precisely. - Fix parser to detect verifier failures and report them through errorReporter instead of printing the error and crashing. This regresses the location info for verifier errors in the parser that were previously ascribed to the function. This will get resolved in future patches by adding support for function attributes, which we can use to manage location information. PiperOrigin-RevId: 209600980	2019-03-29 13:03:11 -07:00
Jacques Pienaar	ff6daf98fe	Add custom lilith script. Add custom lilith script with paths set to MLIR tools directories to simplify lit test. PiperOrigin-RevId: 209486232	2019-03-29 13:02:57 -07:00
Chris Lattner	9265197c4e	Implement initial support for function attributes, including parser, printer, resolver support. Still TODO are verifier support (to make sure you don't use an attribute for a function in another module) and the TODO in ModuleParser::finalizeModule that I will handle in the next patch. PiperOrigin-RevId: 209361648	2019-03-29 13:02:44 -07:00
Chris Lattner	ae79d69922	Implement a module-level symbol table for functions, enforcing uniqueness of names across the module and auto-renaming conflicts. Have the parser reject malformed modules that have redefinitions. PiperOrigin-RevId: 209227560	2019-03-29 13:02:30 -07:00
Chris Lattner	2278bcc891	Add support for floating point constants, fixing b/112707848. This also adds string attribute support. PiperOrigin-RevId: 209074362	2019-03-29 13:01:35 -07:00
Uday Bondhugula	98a24881d3	ShortLoopUnroll - bug fix. Collect loops through a post order walk instead of a pre-order so that inner loops are collected before outer surrounding ones. Add a complex test case. PiperOrigin-RevId: 209041057	2019-03-29 13:01:22 -07:00
MLIR Team	f962e628e3	Adds dealloc MLIR memory operation to StandardOps. PiperOrigin-RevId: 208896071	2019-03-29 13:00:50 -07:00
Chris Lattner	d6c4c748d7	Escape and unescape strings in the parser and printer so they can roundtrip, print floating point in a structured form that we know can round trip, enumerate attributes in the visitor so we print affine mapping attributes symbolically (the majority of the testcase updates). We still have an issue where the hexadecimal floating point syntax is reparsed as an integer, but that can evolve in subsequent patches. PiperOrigin-RevId: 208828876	2019-03-29 13:00:05 -07:00
James Molloy	ab60afb234	[mlir] Allow C-style escapes in Lexer This patch passes the raw, unescaped value through to the rest of the stack. Partial escaping is a total pain to deal with, so we either need to implement escaping properly (ideally using a third party library like absl, I don't think LLVM has one that can handle the proper gamut of escape codes) or don't escape. I chose the latter for this patch. PiperOrigin-RevId: 208608945	2019-03-29 12:59:32 -07:00
Tatiana Shpeisman	4e289a4700	Implement return statement as RetOp operation. Add verification of the return statement placement and operands. Add parser and parsing error tests for return statements with non-zero number of operands. Add a few missing tests for ForStmt parsing errors. Prior to this CL, return statement had no explicit representation in MLIR. Now, it is represented as ReturnOp standard operation and is pretty printed according to the return statement syntax. This way statement walkers can process ML function return operands without making special case for them. PiperOrigin-RevId: 208092424	2019-03-29 12:58:04 -07:00
Uday Bondhugula	8a663870e8	Support for affine integer sets - introduce affine integer sets into the IR - parse and print affine integer sets (both inline or outlined) similar to affine maps - use integer set for IfStmt's conditional, and implement parsing of IfStmt's conditional - fixed an affine expr paren omission bug while on this. TODO: parse/represent/print MLValue operands to affine integer set references. PiperOrigin-RevId: 207779408	2019-03-29 12:56:58 -07:00
Chris Lattner	17ef97bf7e	Refactor the asmparser hook to work with a new OperationState type that fully encapsulates an operation that is yet to be created. This is a patch towards custom ops providing create methods that don't need to be templated, allowing them to move out of line in the future. PiperOrigin-RevId: 207725557	2019-03-29 12:56:30 -07:00
Uday Bondhugula	d8490d8d4f	Loop unrolling pass update - fix/complete forStmt cloning for unrolling to work for outer loops - create IV const's only when needed - test outer loop unrolling by creating a short trip count unroll pass for loops with trip counts <= <parameter> - add unrolling test cases for multiple op results, outer loop unrolling - fix/clean up StmtWalker class while on this - switch unroll loop iterator values from i32 to affineint PiperOrigin-RevId: 207645967	2019-03-29 12:56:16 -07:00
Tatiana Shpeisman	a0a6414ca2	Implement ML function arguments. Add representation for argument list in ML Function using TrailingObjects template. Implement argument iterators, parsing and printing. Unrelated minor change - remove OperationStmt::dropReferences(). Since MLFunction does not have cyclic operand references (it's an AST) destruction can be safely done w/o a special pass to drop references. PiperOrigin-RevId: 207583024	2019-03-29 12:55:47 -07:00
Chris Lattner	ed9fa46413	Continue wiring up diagnostic reporting infrastructure, still WIP. - Implement a diagnostic hook in one of the paths in mlir-opt which captures and reports the diagnostics nicely. - Have the parser capture simple location information from the parser indicating where each op came from in the source .mlir file. - Add a verifyDominance() method to MLFuncVerifier to demo this, resolving b/112086163 - Add some PrettyStackTrace handlers to make crashes in the testsuite easier to track down. PiperOrigin-RevId: 207488548	2019-03-29 12:55:34 -07:00
Uday Bondhugula	65b6e73245	Loop unrolling update. - deal with non-operation stmt's (if/for stmt's) in loops being unrolled (unrolling of non-innermost loops works). - update uses in unrolled bodies to use results of new operations that may be introduced in the unrolled bodies. Unrolling now works for all kinds of loop nests - perfect nests, imperfect nests, loops at any depth, and with any kind of operation in the body. (IfStmt support not done, hence untested there). Added missing dump/print method for StmtBlock. TODO: add test case for outer loop unrolling. PiperOrigin-RevId: 207314286	2019-03-29 12:55:19 -07:00
James Molloy	72645b31b8	[mlir] Add a TypeAttr class, allow type attributes PiperOrigin-RevId: 207235956	2019-03-29 12:54:11 -07:00
Chris Lattner	fc1f223447	Have the asmprinter give true/false constants nice names, add a dump/print method to SSAValue. PiperOrigin-RevId: 207193088	2019-03-29 12:53:44 -07:00
Chris Lattner	316e884367	Give custom ops the ability to also access general additional attributes in the parser and printer. Fix the spelling of 'delimeter' PiperOrigin-RevId: 207189892	2019-03-29 12:53:31 -07:00
James Molloy	6472f5fbbb	[mlir] Fix ReturnInst printing for zero operands No longer prints a trailing ':'. PiperOrigin-RevId: 207103812	2019-03-29 12:53:17 -07:00
Uday Bondhugula	2a003256ae	MLStmt cloning and IV replacement for loop unrolling, add constant pool to MLFunctions. - MLStmt cloning and IV replacement - While at this, fix the innermostLoopGatherer to actually gather all the innermost loops (it was stopping its walk at the first innermost loop it found) - Improve comments for MLFunction statement classes, fix inheritance order. - Fixed StmtBlock destructor. PiperOrigin-RevId: 207049173	2019-03-29 12:53:02 -07:00
Uday Bondhugula	b92378e8fa	More simplification for affine binary op expr's. - simplify operations with identity elements (multiply by 1, add with 0). - simplify successive add/mul: fold constants, propagate constants to the right. - simplify floordiv and ceildiv when divisors are constants, and the LHS is a multiply expression with RHS constant. - fix an affine expression printing bug on paren emission. - while on this, fix affine-map test cases file (memref's using layout maps that were duplicates of existing ones should be emitted pointing to the unique'd one). PiperOrigin-RevId: 207046738	2019-03-29 12:52:48 -07:00
Chris Lattner	8eaf382734	Use SFINAE to generalize << overloads, give 'constant' a pretty form, generalize the asmprinters handling of pretty names to allow arbitrary sugar to be dumped on various constructs. Give CFG function arguments nice "arg0" names like MLFunctions get, and give constant integers pretty names like %c37 for a constant 377 PiperOrigin-RevId: 206953080	2019-03-29 12:52:07 -07:00
Tatiana Shpeisman	8189a12bce	Clean up and extend MLFuncBuilder to allow creating statements in the middle of a statement block. Rename Statement::getFunction() and StmtBlock()::getFunction() to findFunction() to make it clear that this is not a constant time getter. Fix b/112039912 - we were recording 'i' instead of '%i' for loop induction variables causing "use of undefined SSA value" error. PiperOrigin-RevId: 206884644	2019-03-29 12:51:38 -07:00
Chris Lattner	5228ec3146	Fix some issues where we weren't printing affine map references symbolically. Two problems: 1) we didn't visit the types in ops correctly, and 2) the general "T" version of the OpAsmPrinter inserter would match things like MemRefType& and print it directly. PiperOrigin-RevId: 206863642	2019-03-29 12:51:25 -07:00
Jacques Pienaar	1015a0dded	Add parsing for floating point attributes. This is doing it in a suboptimal manner by recombining [integer period literal] into a string literal and parsing that via to_float. PiperOrigin-RevId: 206855106	2019-03-29 12:51:12 -07:00
Chris Lattner	ace4df1200	Revise the AffineExpr printing logic to be more careful about paren emission. This is still (intentionally) generating redundant parens for nested tightly binding expressions, but I think that is reasonable for readability sake. This also prints x-y instead of x-(y*1) PiperOrigin-RevId: 206847212	2019-03-29 12:50:59 -07:00
MLIR Team	d86068203b	Adds a standard op for MLIR 'store' instruction. PiperOrigin-RevId: 206824609	2019-03-29 12:50:06 -07:00
Tatiana Shpeisman	c8b0273f19	Implement induction variables. Pretty print induction variable operands as %i<ssa value number>. Add support for future pretty printing of ML function arguments as %arg<ssa value number>. Induction variables are implemented by inheriting ForStmt from MLValue. ForStmt provides APIs that make this design decision invisible to the ForStmt users. This CL in combination with cl/206253643 resolves http://b/111769060. PiperOrigin-RevId: 206655937	2019-03-29 12:49:36 -07:00
MLIR Team	d48790cc52	Add standard op for MLIR 'alloc' instruction (with parser and associated tests). Adds field to MemRefType to query number of dynamic dimensions. PiperOrigin-RevId: 206633162	2019-03-29 12:49:10 -07:00
Chris Lattner	782c348c00	Change mlir-opt.cpp to take a list of passes to run, simplifying the driver code. Change printing of affine map's to not print a space between the dim and symbol list. PiperOrigin-RevId: 206505419	2019-03-29 12:47:38 -07:00
Chris Lattner	9128a4aa87	Finish parser/printer support for AffineMapOp, implement operand iterators on VariadicOperands, tidy up some code in the asmprinter, fill out more verification logic for LoadOp. PiperOrigin-RevId: 206443020	2019-03-29 12:47:11 -07:00
Chris Lattner	c77f39f55c	Eliminate "primitive" types from being a thing, splitting them into FloatType and OtherType. Other type is now the thing that holds AffineInt, Control, eventually Resource, Variant, String, etc. FloatType holds the floating point types, and allows convenient query of isa<FloatType>(). This fixes issues where we allowed control to be the element type of tensor, memref, vector. At the same time, ban AffineInt from being an element of a vector/memref/tensor as well since we don't need it. I updated the spec to match this as well. PiperOrigin-RevId: 206361942	2019-03-29 12:46:57 -07:00
Chris Lattner	6e89270b2d	Implement support for predecessor iterators on basic blocks, use them to print out predecessor information in the asmprinter. PiperOrigin-RevId: 206343174	2019-03-29 12:46:44 -07:00
Tatiana Shpeisman	9ebd3c7df8	Implement MLValue, statement operands, operation statement operands and values. ML functions now have full support for expressing operations. Induction variables, function arguments and return values are still todo. PiperOrigin-RevId: 206253643	2019-03-29 12:46:04 -07:00
Chris Lattner	50f89b4188	Fix FIXME's/TODOs: - Enhance memref type to allow omission of mappings and address spaces (implying a default mapping). - Fix printing of function types to properly recurse with printType so mappings are printed by name. - Simplify parsing of AffineMaps a bit now that we have isSymbolicOrConstant() PiperOrigin-RevId: 206039755	2019-03-29 12:43:42 -07:00
Chris Lattner	b67fc6c422	Implement custom parser support for operations, enhance dim/addf to use it, and add a new load op. This regresses parser error recovery in some cases (in invalid.mlir) which I'll consider in a follow-up patch. The important thing in this patch is that the parse methods in StandardOps.cpp are nice and simple. PiperOrigin-RevId: 206023308	2019-03-29 12:43:28 -07:00
Uday Bondhugula	e866f57730	Unique AffineDimExpr, AffineSymbolExpr, AffineConstantExpr, and allocate these from the bump pointer allocator. - delete AffineExpr destructors. PiperOrigin-RevId: 205943807	2019-03-29 12:43:15 -07:00
Uday Bondhugula	a0abd666a7	Sketch out loop unrolling transformation. - Implement a full loop unroll for innermost loops. - Use it to implement a pass that unroll all the innermost loops of all mlfunction's in a module. ForStmt's parsed currently have constant trip counts (and constant loop bounds). - Implement StmtVisitor based (Visitor pattern) Loop IVs aren't currently parsed and represented as SSA values. Replacing uses of loop IVs in unrolled bodies is thus a TODO. Class comments are sparse at some places - will add them after one round of comments. A cmd-line flag triggers this for now. Original: mlfunc @loops() { for x = 1 to 100 step 2 { for x = 1 to 4 { "Const"(){value: 1} : () -> () } } return } After unrolling: mlfunc @loops() { for x = 1 to 100 step 2 { "Const"(){value: 1} : () -> () "Const"(){value: 1} : () -> () "Const"(){value: 1} : () -> () "Const"(){value: 1} : () -> () } return } PiperOrigin-RevId: 205933235	2019-03-29 12:43:01 -07:00
MLIR Team	f44636f03d	Adds VariadicOperands and VariadicResult traits to OperationImpl. Uses these in AffineApplyOp verification (with tests). PiperOrigin-RevId: 205921877	2019-03-29 12:42:47 -07:00
Chris Lattner	b5cdf60477	Expose custom asmprinter support to core operations and have them adopt it, fixing the printing syntax for dim, constant, fadd, etc. PiperOrigin-RevId: 205908627	2019-03-29 12:42:08 -07:00
James Molloy	f7f70ee691	[mlir] Implement conditional branch This looks heavyweight but most of the code is in the massive number of operand accessors! We need to be able to iterate over all operands to the condbr (all live-outs) but also just the true/just the false operands too. PiperOrigin-RevId: 205897704	2019-03-29 12:41:55 -07:00
Chris Lattner	6cab858405	Allow 'constant' op to work with affineint, add some accessors, rearrange testsuite a bit. PiperOrigin-RevId: 205852871	2019-03-29 12:41:29 -07:00
Tatiana Shpeisman	1b24c48b91	Scaffolding for convertToCFG pass that replaces all instances of ML functions with equivalent CFG functions. Traverses module MLIR, generates CFG functions (empty for now) and removes ML functions. Adds Transforms library and tests. PiperOrigin-RevId: 205848367	2019-03-29 12:41:15 -07:00
MLIR Team	b14d0189e8	Adds newly renamed "affine_apply" operation to StandardOps. Breaks "core operations" tests out into their own test file. PiperOrigin-RevId: 205848090	2019-03-29 12:41:00 -07:00
James Molloy	4db2ee5f1b	[mlir] Fix a use-after-free iterator error found by asan While fixing this the parser-affine-map.mlir test started failing due to ordering of the printed affine maps. Even the existing CHECK-DAGs weren't enough to disambiguate; a partial match on one line precluded a total match on a following line. The fix for this was easy - print the affine maps in reference order rather than in DenseMap iteration order. PiperOrigin-RevId: 205843770	2019-03-29 12:40:47 -07:00
Chris Lattner	0ab2e2536a	Enhance the customizable "Op" implementations in a bunch of ways: - Op classes can now provide customized matchers, allowing specializations beyond just a name match. - We now provide default implementations of verify/print hooks, so Op classes only need to implement them if they're doing custom stuff, and only have to implement the ones they're interested in. - "Base" now takes a variadic list of template template arguments, allowing concrete Op types to avoid passing the Concrete type multiple times. - Add new ZeroOperands trait. - Add verification hooks to Zero/One/Two operands and OneResult to check that ops using them are correctly formed. - Implement getOperand hooks to zero/one/two operand traits, and getResult/getType hook to OneResult trait. - Add a new "constant" op to show some of this off, with a specialization for the constant case. This patch also splits op validity checks out to a new test/IR/invalid-ops.mlir file. This stubs out support for default asmprinter support. My next planned patch building on top of this will make asmprinter hooks real and will revise this. PiperOrigin-RevId: 205833214	2019-03-29 12:40:34 -07:00
Chris Lattner	4331e5fe4c	Switch return instruction to take its operand list separated from its type list, for consistency with the rest of the language. Consolidate some parsing logic, add operand iterators to BranchInst. PiperOrigin-RevId: 205699457	2019-03-29 12:39:51 -07:00
Jacques Pienaar	0b6b99667b	Vector types elementtype can be either PrimitiveType or IntegerType. Change the type of elementType and remove the cast to PrimitiveType. PiperOrigin-RevId: 205698221	2019-03-29 12:39:38 -07:00
Chris Lattner	21ede32ff5	Implement support for branch instruction operands. PiperOrigin-RevId: 205666777	2019-03-29 12:38:45 -07:00
James Molloy	4144c302db	[mlir] Add basic block arguments This patch adds support for basic block arguments including parsing and printing. In doing so noticed that `ssa-id-and-type` is undefined in the MLIR spec; suggested an implementation in the spec doc. PiperOrigin-RevId: 205593369	2019-03-29 12:38:20 -07:00
Chris Lattner	e402dcc47f	Add support for operands to the return instructions, enhance verifier to report errors through the diagnostics system when invoked by the parser. It doesn't have perfect location info, but it is close enough to be testable. PiperOrigin-RevId: 205534392	2019-03-29 12:38:07 -07:00
Chris Lattner	3d2a24635e	Add support for multiple results to the printer/parser, add support for forward references to the parser, add initial support for SSA use-list iteration and RAUW. PiperOrigin-RevId: 205484031	2019-03-29 12:37:54 -07:00
Chris Lattner	a798b021f9	Teach the asmprinter to print out operands for OperationInst's. This is still limited in several ways, which i'll build out in subsequent patches. Rename the accessor for inst operands/results to make the Operand/Result versions of these more obscure, allowing getOperand/getResult to traffic in values (which is what - by far - most clients actually care about). PiperOrigin-RevId: 205408439	2019-03-29 12:37:00 -07:00
Uday Bondhugula	6d242fcf4b	Simplify affine binary op expression class hierarchy - Drop sub-classing of affine binary op expressions. - Drop affine expr op kind sub. Represent it as multiply by -1 and add. This will also be in line with the math form when we'll need to represent a system of linear equalities/inequalities: the negative number goes into the coefficient of an affine form. (For eg. x_1 + (-1)x_2 + 3x_3 + (-2) >= 0). The folding simplification will transparently deal with multiplying the -1 with any other constants. This also means we won't need to simplify a multiply expression like in x_1 + (-2)x_2 to a subtract expression (x_1 - 2x_2) for canonicalization/uniquing. - When we print the IR, we will still pretty print to a subtract when possible. PiperOrigin-RevId: 205298958	2019-03-29 12:36:46 -07:00
Tatiana Shpeisman	6ada91db02	Parse ML function arguments, return statement operands, and for statement loop header. Loop bounds are presumed to be constants for now and are stored in ForStmt as affine constant expressions. ML function arguments, return statement operands and loop variable name are dropped for now. PiperOrigin-RevId: 205256208	2019-03-29 12:36:20 -07:00
Chris Lattner	72c24e3e71	Add basic parser support for operands: - This introduces a new FunctionParser base class to handle logic common between the kinds of functions we have, e.g. ssa operand/def parsing. - This introduces a basic symbol table (without support for forward references!) and links defs and uses. - CFG functions now parse and build operand lists for operations. The printer isn't set up for them yet tho. PiperOrigin-RevId: 205246110	2019-03-29 12:36:08 -07:00
MLIR Team	f1e039617b	Support for AffineMapAttr. PiperOrigin-RevId: 205157390	2019-03-29 12:35:40 -07:00
Chris Lattner	b3fa7d0e9f	Initial support for operands and results and SSA constructs, first on the instruction side of the house. This has a number of limitations, including that we are still dropping operands on the floor in the parser. Also, most of the convenience methods aren't wired up yet. This is enough to get result type lists round tripping through. PiperOrigin-RevId: 205148223	2019-03-29 12:35:28 -07:00
MLIR Team	fa75d6210e	Adds ModuleState to support printing outlined AffineMaps. PiperOrigin-RevId: 204999887	2019-03-29 12:35:00 -07:00
Tatiana Shpeisman	fc7d6dbe5e	Parse operations in ML functions. Add builder class for ML functions. Refactors operation parsing to share functionality between CFG and ML functions. ML function construction now goes through a builder, similar to the way it is done for CFG functions. PiperOrigin-RevId: 204779279	2019-03-29 12:34:34 -07:00
MLIR Team	8e8114a96d	Adds MemRef type and adds support for parsing memref affine map composition. PiperOrigin-RevId: 204756982	2019-03-29 12:34:20 -07:00
Chris Lattner	c4f35a6605	Switch the comment syntax from ; to // comments as discussed on Friday. There is no strong reason to prefer one or the other, but // is nice for consistency given the rest of the compiler is written in C++. PiperOrigin-RevId: 204628476	2019-03-29 12:33:54 -07:00
Tatiana Shpeisman	8efc06dc2c	Refactor implementation of Statement class hierarchy to use statement block. Use LLVM double-link with parent list to store statements within a block. PiperOrigin-RevId: 204515541	2019-03-29 12:33:28 -07:00
Uday Bondhugula	8fbaf79afb	Parse affine map range sizes. PiperOrigin-RevId: 204240947	2019-03-29 12:32:59 -07:00
Uday Bondhugula	b488a035aa	Implement some simple affine expr canonicalization/simplification. - fold constants when possible. - for a mul expression, canonicalize to always keep the LHS as the constant/symbolic term, and similarly, the RHS for an add expression to keep it closer to the mathematical form. (Eg: f(x) = 3x + 5)); other similar simplifications; - verify binary op expressions at creation time. TODO: we can completely drop AffineSubExpr, and instead use add and mul by -1. This way something like x - 4 and -4 + x get canonicalized to x + -1 4 instead of being x - 4 and x + -4. (The other alternative if wanted to retain AffineSubExpr would be to simplify x + -1*y to x - y and x + <neg number> to x - <pos number>). PiperOrigin-RevId: 204240258	2019-03-29 12:32:45 -07:00
Jacques Pienaar	4b6bf08b3b	Remove const reference to errorReporter. Fixes use-after-free ASAN failure. PiperOrigin-RevId: 204177796	2019-03-29 12:32:32 -07:00
Uday Bondhugula	178fd24813	AffineMap/AffineExpr: delete copy constructor/assignment, refactor affine expr parsing. - also make error messages uniform PiperOrigin-RevId: 203822686	2019-03-29 12:31:17 -07:00
Uday Bondhugula	fc46bcf51d	Complete affine expr parsing support - check for non-affine expressions - handle negative numbers and negation of id's, expressions - functions to check if a map is pure affine or semi-affine - simplify/clean up affine map parsing code - report more error messages, more accurate error messages PiperOrigin-RevId: 203773633	2019-03-29 12:31:03 -07:00
Jacques Pienaar	c90de70329	Expand check-parser-errors to match multiple errors per line. * check-parser-errors can match multiple errors per line; * Add offset notation to expected-error; PiperOrigin-RevId: 203625348	2019-03-29 12:30:35 -07:00
Chris Lattner	9d869ea76d	Add basic lexing and parsing support for SSA operands and definitions. This isn't actually constructing IR objects yet, it is eating the tokens and discarding them. PiperOrigin-RevId: 203616265	2019-03-29 12:30:22 -07:00
Chris Lattner	67c03193de	Implement a simple IR verifier, including support for custom ops adding their own requirements. PiperOrigin-RevId: 203497491	2019-03-29 12:29:55 -07:00
Chris Lattner	9e0e01b47a	Implement Uday's suggestion to unique attribute lists across instructions, reducing the memory impact on Operation to one word instead of 3 from an std::vector. Implement Jacques' suggestion to merge OpImpl::Storage into OpImpl::Base. PiperOrigin-RevId: 203426518	2019-03-29 12:29:42 -07:00
Chris Lattner	1928e20a56	Add the ability to have "Ops" defined as small C++ classes, with some nice properties: - They allow type checked dynamic casting from their base Operation. - They allow nice accessors for C++ clients, e.g. a "getIndex()" method on 'dim' that returns an unsigned. - They work with both OperationInst/OperationStmt (once OperationStmt is implemented). - They get custom printing logic. They will eventually get custom parsing, verifier, and builder logic as well. - Out of tree clients can register their own operation set without having to change MLIR core, e.g. for TensorFlow or custom target instructions. This registers addf and dim as examples. PiperOrigin-RevId: 203382993	2019-03-29 12:29:29 -07:00
Chris Lattner	b0dabbd67f	Add parsing for attributes and attributes on operations. Add IR representation for attributes on operations. Split Operation out from OperationInst so it can be shared with OperationStmt one day. PiperOrigin-RevId: 203325366	2019-03-29 12:29:16 -07:00
Uday Bondhugula	3dc4fb6f0f	Parsing support for affine maps and affine expressions A recursive descent parser for affine maps/expressions with operator precedence and associativity. (While on this, sketch out uniqui'ing functionality for affine maps and affine binary op expressions (partly).) PiperOrigin-RevId: 203222063	2019-03-29 12:28:22 -07:00
Tatiana Shpeisman	177ce7215c	Basic representation and parsing of if and for statements. Loop headers and if statement conditions are not yet supported. PiperOrigin-RevId: 203211526	2019-03-29 12:28:10 -07:00
Chris Lattner	6af866c58d	Enhance the type system to support arbitrary precision integers, which are important for low-bitwidth inference cases and hardware synthesis targets. Rename 'int' to 'affineint' to avoid confusion between "the integers" and "the int type". PiperOrigin-RevId: 202751508	2019-03-29 12:27:32 -07:00
Uday Bondhugula	fdf7bc4e25	[WIP] Sketching IR and parsing support for affine maps, affine expressions Run test case: $ mlir-opt test/IR/parser-affine-map.mlir test/IR/parser-affine-map.mlir:3:30: error: expect '(' at start of map range #hello_world2 (i, j) [s0] -> i+s0, j) ^ PiperOrigin-RevId: 202736856	2019-03-29 12:27:20 -07:00
Chris Lattner	1734d78f88	Sketch out parser/IR support for OperationInst, and a new Instruction base class. Introduce an Identifier class to MLIRContext to represent uniqued identifiers, introduce string literal support to the lexer, introducing parser and printer support etc. PiperOrigin-RevId: 202592007	2019-03-29 12:26:53 -07:00
Tatiana Shpeisman	3609599af6	Introduce IR and parser support for ML functions. Representing function arguments is still TODO. Supporting instructions other than return is also TODO. PiperOrigin-RevId: 202570934	2019-03-29 12:26:41 -07:00
Jacques Pienaar	39a33a2568	Change error verification of parser error checking. Change from using FileCheck to directly verifying the message (simple substring checking) and line number of the error. PiperOrigin-RevId: 201955181	2019-03-29 12:26:02 -07:00
Chris Lattner	2b6684cfbe	Add the unconditional branch instruction, improve diagnostics for block references. PiperOrigin-RevId: 201872745	2019-03-29 12:25:35 -07:00
Jacques Pienaar	a5fb2f47e1	Add negative parsing tests using mlir-opt. Add parsing tests with errors. Follows direct path of splitting file into test groups (using a marker) and parsing each section individually. The expected errors are checked using FileCheck and parser error does not result in terminating parsing the rest of the file if check-parser-error. This is an interim approach until refactoring lexer/parser. PiperOrigin-RevId: 201867941	2019-03-29 12:25:23 -07:00
MLIR Team	642f3e8847	Add tensor type. PiperOrigin-RevId: 201830793	2019-03-29 12:24:58 -07:00
Chris Lattner	80b6bd24b3	Implement parser/IR support for CFG functions, basic blocks and return instruction. This is pretty much minimal scaffolding for this step. Basic block arguments, instructions, other terminators, a proper IR representation for blocks/instructions, etc are all coming. PiperOrigin-RevId: 201826439	2019-03-29 12:24:45 -07:00
Chris Lattner	49795d166f	Introduce IR support for MLIRContext, primitive types, function types, and vector types. tensors and memref types are still TODO, and would be a good starter project for someone. PiperOrigin-RevId: 201782748	2019-03-29 12:24:32 -07:00
Chris Lattner	23b784a1bb	Implement parser and lexer support for most of the type grammar. Semi-affine maps and address spaces are not yet supported (someone want to take this on?). We also don't generate IR objects for types yet, which I plan to tackle next. PiperOrigin-RevId: 201754283	2019-03-29 12:24:20 -07:00
Chris Lattner	9b9f7ff5d4	Implement enough of a lexer and parser for MLIR to parse extfunc's without arguments. PiperOrigin-RevId: 201706570	2019-03-29 12:24:05 -07:00
Chris Lattner	5fc587ecf8	Continue sketching out basic infrastructure, including an input and output filename, and printing of trivial stuff. There is no parser yet, so the input file is ignored. PiperOrigin-RevId: 201596916	2019-03-29 12:23:51 -07:00
MLIR Team	80a03c80a9	[MLIR] Enable lit test driver for simple check test. PiperOrigin-RevId: 201554536	2019-03-29 12:23:38 -07:00
Chris Lattner	9603f9fe35	Sketch out a new repository for the mlir project (go/mlir). PiperOrigin-RevId: 201540159	2019-03-29 12:23:24 -07:00


5226 Commits