llvm-project

Commit Graph

Author	SHA1	Message	Date
Tobias Gysi	b6e7b1be73	[mlir][linalg] Simplify padding test (NFC). The padding tests previously contained the tile loops. This revision removes the tile loops since padding itself does not consider the loops. Instead the induction variables are passed in as function arguments which promotes them to symbols in the affine expressions. Note that the pad-and-hoist.mlir test still exercises padding in the context of the full loop nest. Depends On D114175 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114227	2021-11-24 19:21:50 +00:00
Tobias Gysi	86f186efea	[mlir][linalg] Add makeComposedPadHighOp. Add the makeComposedPadHighOp method which creates a new PadTensorOp if necessary. If the source to pad is actually the result of a sequence of padded LinalgOps, the method checks if padding is needed or if we can use the padded result of the padded LinalgOp sequence directly. Example: ``` %0 = tensor.extract_slice %arg0 [%iv0, %iv1] [%sz0, %sz1] %1 = linalg.pad_tensor %0 low[0, 0] high[...] { linalg.yield %cst } %2 = linalg.matmul ins(...) outs(%1) %3 = tensor.extract_slice %2 [0, 0] [%sz0, %sz1] ``` when padding %3 return %2 instead of introducing ``` %4 = linalg.pad_tensor %3 low[0, 0] high[...] { linalg.yield %cst } ``` Depends On D114161 Reviewed By: nicolasvasilache, pifon2a Differential Revision: https://reviews.llvm.org/D114175	2021-11-24 19:18:59 +00:00
Tobias Gysi	a4fd8cb76f	[mlir][linalg] Update failure conditions for padOperandToSmallestStaticBoundingBox. Change the failure condition of padOperandToSmallestStaticBoundingBox to never fail if the operand is already statically sized. In particular: - if the padding value computation fails -> return failure if the operand shape is dynamic and success if it is static. - if there is no extract slice op -> return failure if the operand shape is dynamic and success if it is static. The latter change prevents padding from failure if the output operand passed by iteration argument is statically sized since in this case the extract / insert slice pairs are removed by canonicalization. Depends On D114153 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114161	2021-11-24 19:10:50 +00:00
Butygin	7f5d9bf13a	[mlir][scf] Canonicalize scf.while with unused results Differential Revision: https://reviews.llvm.org/D114291	2021-11-24 11:11:22 +03:00
Rob Suderman	0f1e52afa9	[mlir][tosa] Materialize tosa.pad value and fold noop pads Padding now can explicitly specify the padding value when non-zero is wanted. This also includes bypassing pads when the pad does nothing. Differential Revision: https://reviews.llvm.org/D113611	2021-11-23 12:23:42 -08:00
Rob Suderman	54eec7cafc	[mlir][tosa] Separate tosa.transpose_conv decomposition and added stride support Transpose convolution decomposition is now performed in a separate pass. This allows padding / constant propagation to be performed at the TOSA level. It also adds support for striding when there is no dilation. Differential Revision: https://reviews.llvm.org/D114409	2021-11-23 12:16:44 -08:00
Nicolas Vasilache	3ff4e5f2a4	[mlir][Vector] Thread 0-d vectors through InsertElementOp. This revision makes concrete use of 0-d vectors to extend the semantics of InsertElementOp. Reviewed By: dcaballe, pifon2a Differential Revision: https://reviews.llvm.org/D114388	2021-11-23 12:55:11 +00:00
Nicolas Vasilache	e7026aba00	[mlir][Vector] Thread 0-d vectors through ExtractElementOp. This revision starts making concrete use of 0-d vectors to extend the semantics of ExtractElementOp. In the process a new VectorOfAnyRank Tablegen OpBase.td is added to allow progressive transition to supporting 0-d vectors by gradually opting in. Differential Revision: https://reviews.llvm.org/D114387	2021-11-23 12:39:44 +00:00
Nicolas Vasilache	b2729fda60	[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm) This revision follows up on the conversation titled: ```[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths``` The revision adds a vblendps-based implementation for transpose8x8 and further distinguishes between an intrinsics and an inline_asm implementation. This results in roughly 20% fewer cycles as reported by llvm-mca: After this revision (intrinsic version, resolves to virtually identical assembly as per the llvm-dev discussion, no vblendps instruction is emitted): ``` Iterations: 100 Instructions: 5900 Total Cycles: 2415 Total uOps: 7300 Dispatch Width: 6 uOps Per Cycle: 3.02 IPC: 2.44 Block RThroughput: 24.0 Cycles with backend pressure increase [ 89.90% ] Throughput Bottlenecks: Resource Pressure [ 89.65% ] - SKXPort1 [ 0.04% ] - SKXPort2 [ 12.42% ] - SKXPort3 [ 12.42% ] - SKXPort5 [ 89.52% ] Data Dependencies: [ 37.06% ] - Register Dependencies [ 37.06% ] - Memory Dependencies [ 0.00% ] ``` After this revision (inline_asm version, vblendps instructions are indeed emitted): ``` Iterations: 100 Instructions: 6300 Total Cycles: 2015 Total uOps: 7700 Dispatch Width: 6 uOps Per Cycle: 3.82 IPC: 3.13 Block RThroughput: 20.0 Cycles with backend pressure increase [ 83.47% ] Throughput Bottlenecks: Resource Pressure [ 83.18% ] - SKXPort0 [ 14.49% ] - SKXPort1 [ 14.54% ] - SKXPort2 [ 19.70% ] - SKXPort3 [ 19.70% ] - SKXPort5 [ 83.03% ] - SKXPort6 [ 14.49% ] Data Dependencies: [ 39.75% ] - Register Dependencies [ 39.75% ] - Memory Dependencies [ 0.00% ] ``` An accessible copy of the conversation is available [here](https://gist.github.com/nicolasvasilache/68c7f34012584b0e00f335bcb374ede0). Differential Revision: https://reviews.llvm.org/D114393	2021-11-23 07:31:22 +00:00
Matthias Springer	fb99686bfd	[mlir][linalg][bufferize] Limited support for scf.execute_region Add support for analysis only. Differential Revision: https://reviews.llvm.org/D114055	2021-11-23 12:20:39 +09:00
Benjamin Kramer	966b720983	[mlir][memref] Fix expanded shape ops memref.cast folding with changed type `memref.expand_shape` has verification logic to make sure result dim must be static if all the collapsing src dims are static. This can be relaxed once expand_shape supports more dynamism. Differential Revision: https://reviews.llvm.org/D114391	2021-11-22 22:56:15 +01:00
Mehdi Amini	e0b7bee7cf	Revert "[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)" This reverts commit `a9e236bed8`. This broke the Windows build: mlir\include\mlir/Dialect/X86Vector/Transforms.h(28): error C2061: syntax error: identifier 'uint'	2021-11-22 19:23:18 +00:00
Lei Zhang	93284120f2	[mlir][vector] Fix TransferOpReduceRank for 0-D tensors We cannot unconditionally generate memref.load ops for such cases; need to check the source's type. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114376	2021-11-22 12:30:46 -05:00
Tobias Gysi	32c43241e7	[mlir][linalg] Always generate an extract/insert slice pair when tiling output tensors. Adapt tiling to always generate an extract/insert slice pair for output tensors even if the tensor is not tiled. Having an explicit extract/insert slice pair simplifies followup transformations such as padding and bufferization. In particular, it makes read and written iteration argument slices explicit. Depends On D114067 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114085	2021-11-22 13:12:43 +00:00
Tobias Gysi	e3d386ea27	[mlir][linalg] Add a tile and fuse on tensors pattern. Add a pattern to apply the new tile and fuse on tensors method. Integrate the pattern into the CodegenStrategy and use the CodegenStrategy to implement the tests. Depends On D114012 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114067	2021-11-22 11:13:21 +00:00
Tobias Gysi	0ccc44cec0	[mlir][linalg] Fix tile and fuse for outermost reduction. Tile and fuse failed if the outermost tile loop is a reduction dimension. Add the necessary check to handle outermost reductions and introduce a test case to verify the change. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114012	2021-11-22 10:44:15 +00:00
Nicolas Vasilache	a9e236bed8	[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm) This revision follows up on the conversation titled: ```[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths``` The revision adds a vblendps-based implementation for transpose8x8 and further distinguishes between an intrinsics and an inline_asm implementation. This results in roughly 20% fewer cycles as reported by llvm-mca: After this revision (intrinsic version, resolves to virtually identical assembly as per the llvm-dev discussion, no vblendps instruction is emitted): ``` Iterations: 100 Instructions: 5900 Total Cycles: 2415 Total uOps: 7300 Dispatch Width: 6 uOps Per Cycle: 3.02 IPC: 2.44 Block RThroughput: 24.0 Cycles with backend pressure increase [ 89.90% ] Throughput Bottlenecks: Resource Pressure [ 89.65% ] - SKXPort1 [ 0.04% ] - SKXPort2 [ 12.42% ] - SKXPort3 [ 12.42% ] - SKXPort5 [ 89.52% ] Data Dependencies: [ 37.06% ] - Register Dependencies [ 37.06% ] - Memory Dependencies [ 0.00% ] ``` After this revision (inline_asm version, vblendps instructions are indeed emitted): ``` Iterations: 100 Instructions: 6300 Total Cycles: 2015 Total uOps: 7700 Dispatch Width: 6 uOps Per Cycle: 3.82 IPC: 3.13 Block RThroughput: 20.0 Cycles with backend pressure increase [ 83.47% ] Throughput Bottlenecks: Resource Pressure [ 83.18% ] - SKXPort0 [ 14.49% ] - SKXPort1 [ 14.54% ] - SKXPort2 [ 19.70% ] - SKXPort3 [ 19.70% ] - SKXPort5 [ 83.03% ] - SKXPort6 [ 14.49% ] Data Dependencies: [ 39.75% ] - Register Dependencies [ 39.75% ] - Memory Dependencies [ 0.00% ] ``` An accessible copy of the conversation is available [here](https://gist.github.com/nicolasvasilache/68c7f34012584b0e00f335bcb374ede0). Reviewed By: ftynse, dcaballe Differential Revision: https://reviews.llvm.org/D114335	2021-11-22 10:32:34 +00:00
Jacques Pienaar	6f9cceb775	[mlir] Move trait to InferTypeOpInterface Step towards removing the hard coded behavior for this trait and to instead use common interface. Differential Revision: https://reviews.llvm.org/D114208	2021-11-21 14:41:12 -08:00
Arnab Dutta	ec7b0d4d34	[MLIR] Simplify Semi-affine expressions by rule based matching and replacing "expr - q * (expr floordiv q)" with "expr mod q" expression. Add rule based matching for detecting and transforming "expr - q * (expr floordiv q)" to "expr mod q", where q is a symbolic expression, in simplifyAdd function. Reviewed By: bondhugula, dcaballe Differential Revision: https://reviews.llvm.org/D112985	2021-11-20 21:05:36 +05:30
Thomas Raoux	47555d73f6	[mlir][gpu] Extend shuffle op modes and add nvvm lowering Add up, down and idx modes to gpu shuffle ops, also change the mode from string to enum Differential Revision: https://reviews.llvm.org/D114188	2021-11-19 11:14:31 -08:00
Thomas Raoux	06dbb28569	[mlir][vector] Remove usage of shapecast to remove unit dim Instead of using shape_cast op in the pattern removing leading unit dimensions we use extract/broadcast ops. This is part of the effort to restrict ShapeCastOp further in the future and only allow them to convert to or from 1D vector. This also adds extra canonicalization to fill the gaps in simplifying broadcast/extract ops. Differential Revision: https://reviews.llvm.org/D114205	2021-11-19 10:25:21 -08:00
Mogball	7c5ecc8b7e	[mlir][vector] Insert/extract element can accept index `vector::InsertElementOp` and `vector::ExtractElementOp` have had their `position` operand changed to accept `AnySignlessIntegerOrIndex` for better operability with operations that use `index`, such as affine loops. LLVM's `extractelement` and `insertelement` can also accept `i64`, so lowering directly to these operations without explicitly inserting casts is allowed. SPIRV's equivalent ops can also accept `i64`. Reviewed By: nicolasvasilache, jpienaar Differential Revision: https://reviews.llvm.org/D114139	2021-11-18 22:40:29 +00:00
MaheshRavishankar	526dfe3f4d	[mlir][Linalg] Do not return failure when all tile sizes are zero. Returning failure when tile sizes are all zero prevents the change in the marker. This makes the pattern rewriter run the pattern multiple times, only to exit when it hits a limit. Instead just clone the operation (since tiling is essentially cloning in this case). Then the transformation filter kicks in to prevent the pattern rewriter from being invoked many times. Differential Revision: https://reviews.llvm.org/D113949	2021-11-18 09:28:25 -08:00
Matthias Springer	ebf8d74e92	[mlir][linalg][bufferize] Fix bufferize bug where non-tensor ops are not skipped `BufferizableOpInterface::bufferize` will only be called on ops that have tensor operands and/or results. Differential Revision: https://reviews.llvm.org/D113962	2021-11-18 16:20:22 +09:00
Aart Bik	1ce77b562d	[mlir][sparse] refine lexicographic insertion to any tensor First version was vectors only. With some clever "path" insertion, we now support any d-dimensional tensor. Up next: reductions too Reviewed By: bixia, wrengr Differential Revision: https://reviews.llvm.org/D114024	2021-11-17 18:08:42 -08:00
Robert Suderman	6e41a06911	[mlir][tosa] Revert add-0 canonicalization for floating-point Floating point optimization can produce incorrect numerical results for -0.0 + 0.0 optimization as result needs to be -0.0. Reviewed By: eric-k256 Differential Revision: https://reviews.llvm.org/D114127	2021-11-17 17:29:57 -08:00
Rob Suderman	044e7e013e	[mlir][tosa] Fixed shape inference for tosa.transpose_conv2d Transpose conv2d shape inference was incorrect, tests did not properly validate that the shape inference was executing. Corrected shape inference, and extended tests to actually execute. Reviewed By: NatashaKnk Differential Revision: https://reviews.llvm.org/D114026	2021-11-17 14:59:52 -08:00
William S. Moses	30d87d4a5d	[MLIR][LLVM] Permit integer types in switch other than i32 LLVM switchop currently only permits i32. Both LLVM IR and MLIR Standard switch permit other integer types leading to an illegal state when lowering an i8 switch from MLIR standard Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D113955	2021-11-16 12:00:37 -05:00
Nicolas Vasilache	b377807a76	[mlir][LLVM] Fix folding of LLVM::ExtractValueOp Limit the backtracking along def-use chains when a prefix is encountered as it would generate incorrect foldings. Differential Revision: https://reviews.llvm.org/D113975	2021-11-16 14:49:05 +00:00
Butygin	6c48f6aafe	[mlir][spirv] add AtomicFAddEXTOp Differential Revision: https://reviews.llvm.org/D113764	2021-11-16 14:24:22 +03:00
Arnab Dutta	1402299271	[MLIR] Simplify semi-affine expressions using flattening For the semi affine expressions, whenever rhs of a floordiv, ceildiv, mod or product expression is a symbolic expression, we introduce a local variable representing the result, and store the floordiv/ceildiv, mod or product affine expression in LocalExprs. In this way the expression is flattened, and trivial addition and subtraction related simplifications are performed. Also adds rule based matching for detecting and transforming "expr - q * (expr floordiv q)" to "expr mod q", where q is a symbolic expression, in simplifyAdd function. Differential Revision: https://reviews.llvm.org/D112808	2021-11-16 15:42:22 +05:30
Mehdi Amini	1585b13024	Revert "[MLIR][LLVM] Permit integer types in switch other than i32" This reverts commit `94992670fc`. Build is broken with: tools/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.cpp.inc:23996:3: error: no matching function for call to 'printSwitchOpCases' printSwitchOpCases(_odsPrinter, *this, getValue().getType(), getCaseValuesAttr(), getCaseDestinations(), getCaseOperands(), getCaseOperands().getTypes()); ^~~~~~~~~~~~~~~~~~	2021-11-16 05:59:12 +00:00
William S. Moses	94992670fc	[MLIR][LLVM] Permit integer types in switch other than i32 LLVM switchop currently only permits i32. Both LLVM IR and MLIR Standard switch permit other integer types leading to an illegal state when lowering an i8 switch from MLIR standard Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D113955	2021-11-16 00:46:25 -05:00
Aart Bik	f66e5769d4	[mlir][sparse] first version of "truly" dynamic sparse tensors as outputs of kernels This revision contains all "sparsification" ops and rewriting necessary to support sparse output tensors when the kernel has no reduction (viz. insertions occur in lexicographic order and are "injective"). This will be later generalized to allow reductions too. Also, this first revision only supports sparse 1-d tensors (viz. vectors) as output in the runtime support library. This will be generalized to n-d tensors shortly. But this way, the revision is kept to a manageable size. Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D113705	2021-11-15 15:33:32 -08:00
not-jenni	cdb0623ad8	[mlir][tosa] Add tosa.mul by one canonicalization Multiply by one can be removed during canonicalization. This optimizes away unneeded operations. Differential Revision: https://reviews.llvm.org/D113807	2021-11-15 14:52:16 -08:00
Nicolas Vasilache	0b17336f79	[mlir][Vector] Make vector.shape_cast based size-1 foldings opt-in and separate. This is in anticipation of dropping them altogether and using insert/extract based patterns. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D113928	2021-11-15 21:17:57 +00:00
Nicolas Vasilache	b828506eca	[mlir][Linalg] Add a DownscaleDepthwiseConv2DNhwcHwcOp decomposition pattern. Reviewed By: gysit Differential Revision: https://reviews.llvm.org/D113907	2021-11-15 20:48:16 +00:00
Nicolas Vasilache	641fe70776	[mlir][Linalg] Fix and improve vectorization of depthwise convolutions. When trying to connect the vectorization of depthwise convolutions to e2e execution a number of problems surfaced. Fix an off-by-one error on the size of the input vector (similarly to what was previously done for regular conv). Rewrite the lowering to vector.fma instead of vector.contract: the KW reduction dimension has already been unrolled and vector.contract requires a reduction dimension to be valid. Differential Revision: https://reviews.llvm.org/D113884	2021-11-15 12:58:05 +00:00
Alexander Belyaev	9b1d90e8ac	[mlir] Move min/max ops from Std to Arith. Differential Revision: https://reviews.llvm.org/D113881	2021-11-15 13:19:17 +01:00
Nicolas Vasilache	f1c86b8354	[mlir][Linalg] Fix off-by-one error in conv vector size computation. Differential Revision: https://reviews.llvm.org/D113877	2021-11-15 11:37:44 +00:00
Matthias Springer	542a8cfba7	[mlir][linalg][bufferize] Fix insertion point of result buffers Differential Revision: https://reviews.llvm.org/D113723	2021-11-15 19:27:33 +09:00
Nicolas Vasilache	f67171ac58	[mlir][Linalg] Make depthwise convolution naming scheme consistent. Names should be consistent across all operations otherwise painful bugs will surface. Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D113762	2021-11-15 07:54:29 +00:00
Nicolas Vasilache	99ff697bf7	[mlir][Vector] Add support for 1D depthwise conv vectorization At this time the 2 flavors of conv are a little too different to allow significant code sharing and others will likely come up, so we go the easy route first by duplicating and adapting. Reviewed By: gysit Differential Revision: https://reviews.llvm.org/D113758	2021-11-12 13:14:09 +00:00
Nicolas Vasilache	aa37318067	[mlir][Linalg] Rewrite DownscaleSizeOneWindowed2DConvolution to use rank-reducing insert/extract slices. This rewriting enables better bufferization and canonicalizations. Differential Revision: https://reviews.llvm.org/D113745	2021-11-12 11:57:12 +00:00
Nicolas Vasilache	8fd2f56c99	[mlir][Linalg] Add 1-d depthwise conv with opdsl Differential Revision: https://reviews.llvm.org/D113686	2021-11-11 17:49:26 +00:00
Stephan Herhut	b241226aec	[mlir][linalg] Avoid illegal elementwise fusion into reductions Fusing into a reduction is only valid if doing so does not erase information on a reduction dimensions size. Differential Revision: https://reviews.llvm.org/D113500	2021-11-11 15:56:12 +01:00
Nicolas Vasilache	34ff857350	[mlir][X86Vector] Add specialized vector.transpose lowering patterns for AVX2 This revision adds an implementation of 2-D vector.transpose for 4x8 and 8x8 for AVX2 and surfaces it to the Linalg level of control. Reviewed By: dcaballe Differential Revision: https://reviews.llvm.org/D113347	2021-11-11 07:33:31 +00:00
lipracer	8165eaa885	[mlir](arithmetic) Add ceildivui to the arithmetic dialect The specific description is [Adding unsigned integer ceil in Std Dialect](https://llvm.discourse.group/t/adding-unsigned-integer-ceil-and-floor-in-std-dialect/4541). Lowering ceilDivOp generates the code below. Sometimes we know m and n are unsigned integers, in which case the checks for positive and negative values are redundant. So we need to add some unsigned operations to simplify the instructions. ``` ceilDiv(n, m) x = (m > 0) ? -1 : 1 return (n*m>0) ? ((n+x) / m) + 1 : - (-n / m) ``` unsigned operations: ``` ceilDivU(n, m) return n == 0 ? 0 : ((n - 1) / m) + 1 ``` Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D113363	2021-11-11 01:49:14 +00:00
Matthias Springer	2e0d821bd5	[mlir][linalg][bufferize] Store analysis results in BufferizationAliasInfo * Store inplace bufferization decisions in `inplaceBufferized`. * Remove `InPlaceSpec`. Use a bool instead. * Use `BufferizableOpInterface::bufferizesToWritableMemory` and `bufferizesToWritableMemory` instead of `getInPlace(BlockArgument)`. The analysis does not care about inplacability of block arguments. It only cares whether the buffer can be written to or not. * The `kInPlaceResultsAttrName` op attribute is for testing purposes only. This commit further decouples BufferizationAliasInfo from other dialects such as SCF. Differential Revision: https://reviews.llvm.org/D113375	2021-11-11 10:36:49 +09:00
Matthias Springer	996d4ffe30	[mlir][linalg][bufferize] Fix bug in InitTensor elimination After replacing the init_tensor with a new value, the new value must be inserted into the corresponding union/equivalence sets. Differential Revision: https://reviews.llvm.org/D113374	2021-11-11 10:28:17 +09:00
Uday Bondhugula	51ae78a6d6	[MLIR][Affine][NFC] affine.store op verifier message fix and check Fix typo in affine.store op verifier message and test case. Differential Revision: https://reviews.llvm.org/D113360	2021-11-11 01:52:23 +05:30
Kevin Cheng	bef966eb37	tosa-make-broadcastable pass now supports numpy style broadcasting only. - fix bug in [c,1] + [a, b, c, d] broadcast - add test [3,3,4,1] + [4,5] Signed-off-by: Kevin Cheng <kevin.cheng@arm.com> Change-Id: Iaed2f04df8775f655c82c740271395274163d147 Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D113596	2021-11-10 11:48:35 -08:00
Tobias Gysi	b326eb64fd	[mlir][linalg] Use CodegenStrategy to test interchange (NFC). Use CodegenStrategy instead of a separate test pass to test iterator interchange. Depends On D113409 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D113550	2021-11-10 15:44:44 +00:00
Tobias Gysi	b676a67092	[mlir][linalg] Use CodegenStrategy to test hoisting (NFC). Use CodegenStrategy instead of a separate test pass to test hoisting. Depends On D113410 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D113411	2021-11-10 15:06:31 +00:00
Tobias Gysi	0c7c532643	[mlir][linalg] Use CodegenStrategy to test padding (NFC). Use CodegenStrategy instead of a separate test pass to test padding. Depends On D113409 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D113410	2021-11-10 15:00:06 +00:00
Tobias Gysi	b86b2309ce	[mlir][linalg] Use AffineApplyOp to compute padding width (NFC). Use AffineApplyOp instead of SubIOp to compute the padding width when creating a pad tensor operation. Depends On D113382 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D113404	2021-11-10 14:53:52 +00:00
Tobias Gysi	0609eb1b32	[mlir][linalg] Remove padding from tiling options. Remove the padding options from the tiling options since padding is now implemented by a separate pattern/pass introduced in https://reviews.llvm.org/D112412. The revision removes tile-and-pad-tensors.mlir and replaces it with pad.mlir, which tests padding in isolation (without tiling). Similarly, hoist-padding.mlir is replaced by pad-and-hoist.mlir introduced in https://reviews.llvm.org/D112713. Depends On D112838 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D113382	2021-11-10 13:33:28 +00:00
Matthias Springer	c3eb967e2a	[mlir][linalg][bufferize] Bufferize ops via PreOrder traversal The existing PostOrder traversal with special rules for certain ops was complicated and had a bug. Switch to PreOrder traversal. Differential Revision: https://reviews.llvm.org/D113338	2021-11-10 18:51:39 +09:00
Matthias Springer	99ad2079d4	[mlir][linalg][bufferize] Fix buffer equivalence around scf.if ops Also extend the comments for aliasInfo and equivalenceInfo. Differential Revision: https://reviews.llvm.org/D113340	2021-11-10 18:33:08 +09:00
Matthias Springer	f74f09128b	[mlir][linalg][bufferize] Relax tensor.insert_slice conflict rules A tensor.insert_slice write does not conflict with a subsequent read of the source if the source is originating from a matching tensor.extract_slice. Differential Revision: https://reviews.llvm.org/D113446	2021-11-10 18:23:29 +09:00
Suraj Sudhir	82568021dd	[mlir][tosa] Spec v0.23 updates Add pad_const field to tosa.pad. Add builders to enable optional construction of pad_const in pad op. Update documentation of tosa.clamp to match spec wording. Signed-off-by: Suraj Sudhir <suraj.sudhir@arm.com> Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D113322	2021-11-08 10:13:54 -08:00
Tobias Gysi	1726c956ae	[mlir][linalg] Improve hoist padding buffer size computation. Adapt the Fourier Motzkin elimination to take into account affine computations happening outside of the cloned loop nest. Depends On D112713 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D112838	2021-11-08 12:02:57 +00:00
Tobias Gysi	9fbcad3298	[mlir][linalg] Improve the padding packing loop computation. The revision updates the packing loop search in hoist padding. Instead of considering all loops in the backward slice, we now compute a separate backward slice containing the index computations only. This modification ensures we do not add packing loops that are not used to index the packed buffer due to spurious dependencies. One instance where such spurious dependencies can appear is the extract slice operation introduced between the tile loops of a double tiling. Depends On D112412 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D112713	2021-11-08 10:20:33 +00:00
Shraiysh Vaishay	19a7e4729d	[MLIR][OpenMP] Added omp.sections and omp.section Added omp.sections and omp.section operation according to the section 2.8.1 of OpenMP Standard 5.0. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D110844	2021-11-06 19:27:35 +05:30
Aart Bik	2f0ee17017	[mlir][sparse] test for SIMD reduction chaining in consecutive vector loops Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D113197	2021-11-05 10:14:17 -07:00
Aart Bik	7373cabcda	[mlir][sparse] implement full reduction "scalarization" across loop nests The earlier reduction "scalarization" was only applied to a chain of innermost and for loops. This revision generalizes this to any nesting of for- and while-loops. This implies that reductions can be implemented with a lot less load and store operations. The chaining is implemented with a forest of yield statements (but not as bad as when we would also include the while-induction). Fixes https://bugs.llvm.org/show_bug.cgi?id=52311 Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D113078	2021-11-04 17:38:47 -07:00
not-jenni	07a029c057	Canonicalization for add to no-op if one of the inputs is zero Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D113207	2021-11-04 16:52:47 -07:00
Aart Bik	4aa9b39824	[mlir][sparse] reject sparsity annotation in "scalar" tensors Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D113152	2021-11-04 09:49:05 -07:00
Tobias Gysi	29c31cb79b	[mlir][linalg] Add support for transitive fusion. Extend fusion on tensors to fuse producers greedily. Reviewed By: nicolasvasilache, hanchung Differential Revision: https://reviews.llvm.org/D110262	2021-11-04 16:25:06 +00:00
Butygin	1cb13fddb9	[mlir] spirv: Add some atomic ops Differential Revision: https://reviews.llvm.org/D112812	2021-11-03 14:47:12 +03:00
Nicolas Vasilache	9c4971740b	[mlir][Linalg] Refactor vectorization of conv1d more aggressively. This better decouples transfer read/write from vector-only rewrite of conv. This form is close to ready to plop into a new vector.conv op and the vector.transfer operations to be generalized as part of generic vectorization once the properties ConvolutionOpInterface are inferred from the indexing maps. This also results in a nice perf boost in the dw == 1 cases. Differential revision: https://reviews.llvm.org/D112822	2021-11-03 08:18:01 +00:00
Nicolas Vasilache	7b09f157e1	[mlir][Linalg] Refactor conv vectorization to decouple memory from vector ops. This refactoring prepares conv1d vectorization for a future integration into the generic codegen path. Once transfer_read / transfer_write vectorization also supports sliding windows, the special pattern for conv can disappear. This will also likely need a vector.conv operation. Differential Revision: https://reviews.llvm.org/D112797	2021-11-03 08:03:40 +00:00
Nicolas Vasilache	885072820c	[mlir][Vector] Add a pattern to lower 2-D vector.transpose to shape_cast+shuffle. The 2-D case can be rewritten to generate considerably fewer instructions and a single vector.shuffle which seems to provide a nice performance boost. Add this arrow to our quiver by exposing it with a new vector transform option. Differential Revision: https://reviews.llvm.org/D113062	2021-11-02 22:12:46 +00:00
Lei Zhang	7b615a87dc	[mlir][linalg] Rewrite `linalg.conv_2d_nhwc_hwcf` into 1-D We'd like to take a progressive approach towards convolution op CodeGen, by 1) tiling it to fit compute hierarchy first, and then 2) tiling along window dimensions with size 1 to reduce the problem to be matmul-like. After that, we can 3) downscale high-D convolution ops to low-D by removing the size-1 window dimensions. The final step would be 4) vectorizing the low-D convolution op directly. We have patterns for 1), 2), and 4). This commit adds a pattern for 3) for `linalg.conv_2d_nhwc_hwcf` ops as a starter. Supporting other high-D convolution ops should be similar and mechanical. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D112928	2021-11-02 09:56:26 -04:00
thomasraoux	8a992b20db	[mlir][gpu] Add basic support to do elementwise ops on mma matrix type In order to support fusion with mma matrix type we need to be able to execute elementwise operations on them. This adds an op to be able to support some basic elementwise operations. This is not a full solution as it only supports a limited scope of operations. Ideally we would want to be able to fuse with more kinds of operations. Differential Revision: https://reviews.llvm.org/D112857	2021-11-01 11:51:19 -07:00
thomasraoux	77eafb8430	[mlir][nvvm] Generalize wmma ops to handle more types and shapes wmma intrinsics have a large number of combinations, ideally we want to be able to target all the different variants. To avoid a combinatorial explosion in the number of mlir ops we use attributes to represent the different variations of load/store/mma ops. We can also generate helpers with tablegen to know which combinations are available. Using this we can avoid having to hardcode a path for specific shapes and can support more types. This patch also adds boilerplate for tf32 op support. Differential Revision: https://reviews.llvm.org/D112689	2021-11-01 10:27:26 -07:00
Ahmed Taei	813fa79c15	Don't drop in_bounds when vector-transfer-collapse-inner-most-dims When operand is a subview we don't infer in_bounds and some default cases (e.g case in the tests) will crash with `operand is NULL` when converting to LLVM Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D112772	2021-10-29 09:07:57 -07:00
Tobias Gysi	6638112b42	[mlir][linalg] Add padding pass to strategy passes. Add a strategy pass that pads and hoists after tiling and fusion. Depends On D112412 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D112480	2021-10-29 15:30:42 +00:00
Tobias Gysi	d0ec4a8ed9	[mlir][linalg] Add pad and hoist test pass. Adding a padding and hoisting pattern, a test pass, and tests. The patch prepares the split of tiling/fusion and padding. Depends On D112255 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D112412	2021-10-29 15:08:16 +00:00
wren romano	5389cdc8f6	[mlir][sparse] Adding dynamic-size support for sparse=>dense conversion Depends On D110790 Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D112674	2021-10-28 16:56:18 -07:00
wren romano	28882b6575	[mlir][sparse] Implementing sparse=>dense conversion. Depends On D110882, D110883, D110884 Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D110790	2021-10-28 15:27:35 -07:00
Eugene Zhulenev	627fa0b9a8	[mlir] MathApproximations: unroll virtual vectors into hardware vectors for ISA specific operation Reviewed By: cota Differential Revision: https://reviews.llvm.org/D112736	2021-10-28 12:52:04 -07:00
Uday Bondhugula	57b9b29649	[MLIR][LLVM] Add llvm.mlir.global_ctors/dtors and translation support Add llvm.mlir.global_ctors and global_dtors ops and their translation support to LLVM global_ctors/global_dtors global variables. Differential Revision: https://reviews.llvm.org/D112524	2021-10-28 18:09:34 +05:30
Shraiysh Vaishay	30bd11fab4	[MLIR][OpenMP] Fixed the missing inclusive clause in omp.wsloop and fix order clause This patch adds the inclusive clause (which was missed in previous reorganization - https://reviews.llvm.org/D110903) in omp.wsloop operation. Added a test for validating it. Also fixes the order clause, which was not accepting any values. It now accepts "concurrent" as a value, as specified in the standard. Reviewed By: kiranchandramohan, peixin, clementval Differential Revision: https://reviews.llvm.org/D112198	2021-10-28 14:18:05 +05:30
Matthias Springer	5b98e4ed16	[mlir][linalg][bufferize] Add analysis fuzzer option Analyze ops in a pseudo-random order to see if any assertions are triggered. Randomizing the order of analysis likely worsens the quality of the bufferization result (more out-of-place bufferizations). However, assertions should never fail, as that would indicate a problem with our implementation. Differential Revision: https://reviews.llvm.org/D112581	2021-10-27 17:37:56 +09:00
Shraiysh Vaishay	9fb52cb3f1	[MLIR][OpenMP] Added omp.atomic.read and omp.atomic.write This patch supports the atomic construct (read and write) following section 2.17.7 of OpenMP 5.0 standard. Also added tests and verifier for the same. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D111992	2021-10-27 14:05:44 +05:30
River Riddle	015192c634	[mlir:DialectConversion] Restructure how argument/target materializations get invoked The current implementation invokes materializations whenever an input operand does not have a mapping for the desired type, i.e. it requires materialization at the earliest possible point. This conflicts with the goal of dialect conversion (and also the current documentation) which states that a materialization is only required if the materialization is supposed to persist after the conversion process has finished. This revision refactors this such that whenever a target materialization "might" be necessary, we insert an unrealized_conversion_cast to act as a temporary materialization. This allows for deferring the invocation of the user materialization hooks until the end of the conversion process, where we actually have a better sense if it's actually necessary. This has several benefits: * In some cases a target materialization hook is no longer necessary When performing a full conversion, there are some situations where a temporary materialization is necessary. Moving forward, these users won't need to provide any target materializations, as the temporary materializations do not require the user to provide materialization hooks. * getRemappedValue can now handle values that haven't been converted yet Before this commit, it wasn't well supported to get the remapped value of a value that hadn't been converted yet (making it difficult/impossible to convert multiple operations in many situations). This commit updates getRemappedValue to properly handle this case by inserting temporary materializations when necessary. Another code-health related benefit is that with this change we can move a majority of the complexity related to materializations to the end of the conversion process, instead of handling it ad hoc while conversion is happening. Differential Revision: https://reviews.llvm.org/D111620	2021-10-27 02:09:04 +00:00
Aart Bik	1e6ef0cfb0	[mlir][sparse] refine trait of sparse_tensor.convert Rationale: The currently used trait was demanding that all types are the same which is not true (since the sparse part may change and the dim sizes may be relaxed). This revision uses the correct trait and makes the rank match test explicit in the verify method. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D112576	2021-10-26 14:36:49 -07:00
Alexander Belyaev	96cee29762	[mlir] Allow polynomial approximations for N-d vectors. Polynomial approximation can be extended to support N-d vectors. N-dimensional vectors are useful when vectorizing operations on N-dimensional tiles. Before lowering to LLVM these vectors are usually unrolled or flattened to 1-dimensional vectors. Differential Revision: https://reviews.llvm.org/D112566	2021-10-26 20:50:00 +02:00
Amy Zhuang	b9ae741d3e	[mlir] Fix getVectorReductionOp 1.Combining kind min/max of Vector reduction op has been changed to minf/maxf, minsi/maxsi, and minui/maxui. Modify getVectorReductionOp accordingly. 2.Add min/max to supported reductions. Reviewed By: dcaballe, nicolasvasilache Differential Revision: https://reviews.llvm.org/D112246	2021-10-26 08:42:34 -07:00
Uday Bondhugula	41a8b46007	[MLIR] Fix AffineExpr getLargestKnownDivisor for ceildiv and floordiv Fix AffineExpr `getLargestKnownDivisor` for ceil/floor div cases. In these cases, nothing can be inferred on the divisor of the result. Add test case for `mod` as well. Differential Revision: https://reviews.llvm.org/D112523	2021-10-26 16:21:29 +05:30
Robert Suderman	58901a5a29	[mlir][tosa] Correct tosa.avg_pool2d for specification error Specification specified the output type for quantized average pool should be an i32. Only accumulator should be an i32, result type should match the input type. Caused in https://reviews.llvm.org/D111590 Reviewed By: sjarus, GMNGeoffrey Differential Revision: https://reviews.llvm.org/D112484	2021-10-25 14:41:16 -07:00
MaheshRavishankar	2f572818b0	[mlir][Linalg] Allow comprehensive bufferization to use callbacks for alloc/dealloc. Using callbacks for allocation/deallocation allows users to override the default. Also add an option to comprehensive bufferization pass to use `alloca` instead of `alloc`s. Note that this option is just for testing. The option to use `alloca` does not work well with the option to allow for returning memrefs.	2021-10-25 12:43:10 -07:00
Boian Petkantchin	f1b922188e	[MLIR][Math] Add erf to math dialect Add math.erf lowering to libm call. Add math.erf polynomial approximation. Reviewed By: silvas, ezhulenev Differential Revision: https://reviews.llvm.org/D112200	2021-10-25 18:30:17 +00:00
Aart Bik	1b15160ef3	[mlir][sparse] lower trivial tensor.cast on identical sparse tensors Even though tensor.cast is not part of the sparse tensor dialect, it may be used to cast static dimension sizes to dynamic dimension sizes for sparse tensors without changing the actual sparse tensor itself. Those cases should be lowered properly when replacing sparse tensor types with their opaque pointers. Likewise, no op sparse conversions are handled by this revision in a similar manner. Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D112173	2021-10-25 10:30:19 -07:00
MaheshRavishankar	5fb46a9fa3	Revert "[mlir][Linalg] Allow comprehensive bufferization to use callbacks for alloc/dealloc." This reverts commit `c86f218fe4`. Revert because it causes build failure.	2021-10-25 08:57:53 -07:00
MaheshRavishankar	c86f218fe4	[mlir][Linalg] Allow comprehensive bufferization to use callbacks for alloc/dealloc. Using callbacks for allocation/deallocation allows users to override the default. Also add an option to comprehensive bufferization pass to use `alloca` instead of `alloc`s. Note that this option is just for testing. The option to use `alloca` does not work well with the option to allow for returning memrefs. Differential Revision: https://reviews.llvm.org/D112166	2021-10-25 08:50:25 -07:00
Jacques Pienaar	42e9af9e8f	[mlir] Rename to avoid overlap in accessor prefixing Split out renaming from D112383 into standalone change.	2021-10-24 18:17:09 -07:00
Emilio Cota	35553d452b	[mlir] Add polynomial approximation for vectorized math::Rsqrt This patch adds a polynomial approximation that matches the approximation in Eigen. Note that the approximation only applies to vectorized inputs; the scalar rsqrt is left unmodified. The approximation is protected with a flag since it emits an AVX2 intrinsic (generated via the X86Vector). This is the only reasonably clean way that I could find to generate the exact approximation that I wanted (i.e. an identical one to Eigen's). I considered two alternatives: 1. Introduce a Rsqrt intrinsic in LLVM, which doesn't exist yet. I believe this is because there is no definition of Rsqrt that all backends could agree on, since hardware instructions that implement it have widely varying degrees of precision. This is something that the standard could mandate, but Rsqrt is not part of IEEE754, so I don't think this option is feasible. 2. Emit fdiv(1.0, sqrt) with fast math flags to allow reciprocal transformations. Although portable, this doesn't allow us to generate exactly the code we want; it is the LLVM backend, and not MLIR, who controls what code is generated based on the target CPU. Reviewed By: ezhulenev Differential Revision: https://reviews.llvm.org/D112192	2021-10-23 04:56:12 -07:00
Mats Petersson	3f00e10bdd	[mlir][OpenMP]Support for modifiers in workshare loops Pass the modifiers from the Flang parser to FIR/MLIR workshare loop operation. Not yet supporting the SIMD modifier, which is a bit more work than just adding it to the list of modifiers, so will go in a separate patch. This adds a new field to the WsLoopOp. Also add test for dynamic WSLoop, checking that dynamic schedule calls the init and next functions as expected. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D111053	2021-10-22 14:19:33 +01:00
Matthias Springer	3bbc869e2e	[mlir][linalg][bufferize] Support scf::IfOp This commit adds support for scf::IfOp to comprehensive bufferization. Support is currently limited to cases where both branches yield tensors that bufferize to the same buffer. To keep the analysis simple, scf::IfOp are treated as memory writes for analysis purposes, even if no op inside any branch is writing. (scf::ForOps are handled in the same way.) Differential Revision: https://reviews.llvm.org/D111929	2021-10-22 10:12:55 +09:00
Mogball	516884f58b	[MLIR] Fix FloorDivSIOpConverter that was failing for index type after the arithmetic op refactor ConstantOp should be used instead of ConstantIntOp to be able to support index type. Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D112191	2021-10-21 21:42:30 +00:00
thomasraoux	93d0ade17c	[mlir][linalg] Remove special case for contraction vectorization Handle contraction op like all the other generic op reductions. This simplifies the code. We now rely on contractionOp canonicalization to keep the same code quality. Differential Revision: https://reviews.llvm.org/D112171	2021-10-21 14:10:54 -07:00
thomasraoux	1d8cc45b0e	[mlir][vector] Add patterns to convert multidimreduce to vector.contract Add several patterns that will simplify contraction vectorization in the future. With those canonicalizations we will be able to remove the special case for contraction during vectorization and rely on those transformations to avoid materializing broadcast ops. Differential Revision: https://reviews.llvm.org/D112121	2021-10-21 14:03:32 -07:00
Ahmed Taei	21f9e4a1ed	Avoid infinity arithmetic when computing exp approximations Otherwise this can result in a poison value on some platforms; see https://bugs.llvm.org/show_bug.cgi?id=51204 Reviewed By: ezhulenev Differential Revision: https://reviews.llvm.org/D112115	2021-10-21 10:09:18 -07:00
Nicolas Vasilache	203accf0bd	[mlir][Linalg] Improve conv vectorization for the stride==1 case. In the stride == 1 case, conv1d reads contiguous data along the input dimension. This can be advantageously used to bulk memory transfers and compute while avoiding unrolling. Experimentally, this can yield speedups of up to 50%. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D112139	2021-10-21 15:18:28 +00:00
Matthias Springer	7a7e93f122	[mlir][linalg][bufferize] Avoid creating copies that are never read Differential Revision: https://reviews.llvm.org/D111956	2021-10-21 21:47:00 +09:00
Matthias Springer	c5501a7a5c	[mlir][linalg][bufferize] Eliminate InitTensorOps of InsertSliceOp sources An InitTensorOp is replaced with an ExtractSliceOp on the InsertSliceOp's destination. This optimization is applied after analysis and only to InsertSliceOps that were decided to bufferize inplace. Another analysis on the new ExtractSliceOp is needed after the rewrite. Differential Revision: https://reviews.llvm.org/D111955	2021-10-21 21:33:45 +09:00
Peixin-Qiao	b37e5187f2	[MLIR][OpenMP] Add support for ordered construct This patch supports the ordered construct in OpenMP dialect following Section 2.19.9 of the OpenMP 5.1 standard. Also lowering to LLVM IR using OpenMP IRBuilder. Lowering to LLVM IR for ordered simd directive is not supported yet since LLVM optimization passes do not support it for now. Reviewed By: kiranchandramohan, clementval, ftynse, shraiysh Differential Revision: https://reviews.llvm.org/D110015	2021-10-21 16:30:46 +08:00
Matthias Springer	9c55e718f5	[mlir][linalg][bufferize] Bufferize using PostOrder traversal This is required for bufferization of scf::IfOp, which is added in a subsequent commit. Some ops (scf::ForOp, TiledLoopOp) require PreOrder traversal to make sure that bbArgs are mapped before bufferizing the loop body. Differential Revision: https://reviews.llvm.org/D111924	2021-10-21 17:21:52 +09:00
Mehdi Amini	cb11ddb96c	Revert "[MLIR][OpenMP] Add support for ordered construct" This reverts commit `dc2be87ecf`. Seems like this broke all the CI bots.	2021-10-21 04:53:45 +00:00
Peixin-Qiao	dc2be87ecf	[MLIR][OpenMP] Add support for ordered construct This patch supports the ordered construct in OpenMP dialect following Section 2.19.9 of the OpenMP 5.1 standard. Also lowering to LLVM IR using OpenMP IRBuilder. Lowering to LLVM IR for ordered simd directive is not supported yet since LLVM optimization passes do not support it for now. Reviewed By: kiranchandramohan, clementval, ftynse, shraiysh Differential Revision: https://reviews.llvm.org/D110015	2021-10-21 09:16:04 +08:00
Aart Bik	bd5494d127	[mlir][sparse] make index type explicit in public API of support library The current implementation used explicit index->int64_t casts for some, but not all instances of passing values of type "index" in and from the sparse support library. This revision makes the situation more consistent by using new "index_t" type at all such places (which allows for less trivial casting in the generated MLIR code). Note that the current revision still assumes that "index" is 64-bit wide. If we want to support targets with alternative "index" bit widths, we need to build the support library differently. But the current revision is a step forward by making this requirement explicit and more visible. Reviewed By: wrengr Differential Revision: https://reviews.llvm.org/D112122	2021-10-20 12:46:31 -07:00
Ahmed S. Taei	a3dd4e7770	Drop transfer_read inner most unit dimensions Add a pattern to take a rank-reducing subview and drop inner most contiguous unit dim. This is useful when lowering vector to backends with 1d vector types. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D111561	2021-10-20 19:27:04 +00:00
Shraiysh Vaishay	c4c7e06bd7	[MLIR][OpenMP] Shifted hint from CriticalOp to CriticalDeclareOp According to the OpenMP 5.0 standard, names and hints of critical operation are closely related. The following are the restrictions on them: - Unless the effect is as if `hint(omp_sync_hint_none)` was specified, the critical construct must specify a name. - If the hint clause is specified, each of the critical constructs with the same name must have a hint clause for which the hint-expression evaluates to the same value. These restrictions will be enforced by design if the hint expression is a part of the `omp.critical.declare` operation. - Any operation with no "name" will be considered to have `hint(omp_sync_hint_none)`. - All the operations with the same "name" will have the same hint value. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D112134	2021-10-20 21:36:09 +05:30
Nicolas Vasilache	6bb7d2474f	[mlir][Linalg] Add a first vectorization pattern for conv1d in NWCxWCF format. This revision uses the newly refactored StructuredGenerator to create a simple vectorization for conv1d_nwc_wcf. Note that the pattern is not specific to the op and is technically not even specific to the ConvolutionOpInterface (modulo minor details related to dilations and strides). The overall design follows the same ideas as the lowering of vector::ContractionOp -> vector::OuterProduct: it seeks to be minimally complex, composable and extensible while avoiding inference analysis. Instead, we metaprogram the maps/indexings we expect and we match against them. This is just a first stab and still needs to be evaluated for performance. Other tradeoffs are possible that should be explored. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D111894	2021-10-20 13:54:18 +00:00
Kojo Acquah	9c62bb55f4	Implementation of `ReshapeNoopOptimization` canonicalizer. This canonicalizer replaces reshapes of constant tensors that contain the updated shape (skipping the reshape operation). Differential Revision: https://reviews.llvm.org/D112038	2021-10-19 16:07:34 -07:00
bakhtiyar	f97f946839	Canonicalize max/min operations on integers. Reviewed By: ezhulenev Differential Revision: https://reviews.llvm.org/D112051	2021-10-19 05:25:59 -07:00
Shraiysh Vaishay	d576f45014	[MLIR][OpenMP] Added parseClauses Code reorganized in OpenMPDialect.cpp to have all functions corresponding to an operation together. Added parseClauses function to avoid code duplication while parsing clauses in OpenMP operations. Also added printers and verifiers for clauses, which are being used for multiple operations. Reviewed By: kiranchandramohan, peixin Differential Revision: https://reviews.llvm.org/D110903	2021-10-19 17:31:36 +05:30
Vladislav Vinogradov	e41ebbecf9	[mlir][RFC] Refactor layout representation in MemRefType The change is based on the proposal from the following discussion: https://llvm.discourse.group/t/rfc-memreftype-affine-maps-list-vs-single-item/3968 * Introduce `MemRefLayoutAttr` interface to get `AffineMap` from an `Attribute` (`AffineMapAttr` implements this interface). * Store layout as a single generic `MemRefLayoutAttr`. This change removes the affine map composition feature and related API. Actually, while the `MemRefType` itself supported it, almost none of the upstream can work with more than 1 affine map in `MemRefType`. The introduced `MemRefLayoutAttr` allows to re-implement this feature in a more stable way - via separate attribute class. Also the interface allows to use different layout representations rather than affine maps. For example, the described "stride + offset" form, which is currently supported in ASM parser only, can now be expressed as separate attribute. Reviewed By: ftynse, bondhugula Differential Revision: https://reviews.llvm.org/D111553	2021-10-19 12:31:15 +03:00
not-jenni	4ada6c2aaf	[mlir][tosa] Adds a canonicalization to the transpose op if the perms are a no op Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D112037	2021-10-18 16:30:53 -07:00
Aart Bik	9d1db3d4a1	[mlir][sparse] generalize sparse_tensor.convert on static/dynamic dimension sizes This revision lifts the artificial restriction on having exact matches between source and destination type shapes. A static size may become dynamic. We still reject changing a dynamic size into a static size to avoid the need for a runtime "assert" on the conversion. This revision also refactors some of the conversion code to share same-content buffers. Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D111915	2021-10-18 13:54:03 -07:00
Eugene Zhulenev	bf32bb7e05	[mlir] Update approximation range for Tanh operation Use wider range for approximating Tanh to match results computed in Eigen with AVX. Reviewed By: cota Differential Revision: https://reviews.llvm.org/D112011	2021-10-18 10:57:31 -07:00
Ahmed Taei	b0c4aaff24	Allow only valid vector.shape_cast transitive folding When folding A->B->C => A->C only accept A->C that is valid shape cast Reviewed By: ThomasRaoux, nicolasvasilache Differential Revision: https://reviews.llvm.org/D111473	2021-10-18 07:57:55 -07:00
Matthias Springer	e7bb8dd929	[mlir][linalg][bufferize] Relax rules for extract_slice/insert_slice matching The rules were too restrictive, causing out-of-place bufferization when the result of two ExtractSliceOp is fed into an InsertSliceOp. Differential Revision: https://reviews.llvm.org/D111861	2021-10-16 17:08:47 +09:00
Jacques Pienaar	965ec6dbe7	[mlir] Add folder for shape.add	2021-10-15 17:30:17 -07:00
Aart Bik	b24788abd8	[mlir][sparse] implement sparse tensor init operation Next step towards supporting sparse tensors outputs. Also some minor refactoring of enum constants as well as replacing tensor arguments with proper buffer arguments (latter is required for more general sizes arguments for the sparse_tensor.init operation, as well as more general sparse_tensor.convert operations later) Reviewed By: wrengr Differential Revision: https://reviews.llvm.org/D111771	2021-10-15 09:33:16 -07:00
Matthias Springer	7dd7078760	[mlir][linalg][bufferize] Handle scf::ForOp correctly in bufferizesToMemoryRead From the perspective of analysis, scf::ForOp is treated as a black box. Basic block arguments do not alias with their respective OpOperands on the ForOp, so they do not participate in conflict analysis with ops defined outside of the loop. However, bufferizesToMemoryRead and bufferizesToMemoryWrite on the scf::ForOp itself are used to determine how the scf::ForOp interacts with its surrounding ops. Differential Revision: https://reviews.llvm.org/D111775	2021-10-15 11:24:21 +09:00
Matthias Springer	d3cb6bf2d4	[mlir][linalg][bufferize] Rewrite conflict detection For each memory read, follow SSA use-def chains to find the op that produces the data being read (i.e., the most recent write). A memory write to an alias is a conflict if it takes place after the "most recent write" but before the read. This CL introduces two main changes: * There is a concise definition of a conflict. Given a piece of IR with InPlaceSpec annotations and a computed alias set, it is easy to compute whether this program has a conflict. No need to consider multiple cases such as "read of operand after in-place write" etc. * No need to check for clobbering. Differential Revision: https://reviews.llvm.org/D111287	2021-10-15 10:31:02 +09:00
thomasraoux	afad0cdf31	[mlir][vector] Refactor linalg vectorization for reductions Emit reduction during op vectorization instead of doing it when creating the transfer write. This allows us to not broadcast output arguments for reduction initial value. Differential Revision: https://reviews.llvm.org/D111825	2021-10-14 13:37:56 -07:00
Nicolas Vasilache	82dd977baf	[mlir][Linalg] Tighten canonicalization of InsertSliceOp that triggers infinite loop I am unclear this is reproducible with correct IR but atm the verifier for InsertSliceOp is not powerful enough and this triggers an infinite loop that is worth fixing independently. Differential Revision: https://reviews.llvm.org/D111812	2021-10-14 15:26:03 +00:00
Nicolas Vasilache	0eeaad3012	[mlir][Linalg] Fix insertion point in comprehensive bufferization	2021-10-14 15:24:09 +00:00
Tobias Gysi	a8f69be61f	[mlir][linalg] Expose flag to control nofold attribute when padding. Setting the nofold attribute enables packing an operand. At the moment, the attribute is set by default. The patch introduces a callback to control the flag. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D111718	2021-10-14 10:07:07 +00:00
Tobias Gysi	eaa52750ce	[mlir][linalg] Verify every LinalgOp has a body. After removing the last LinalgOps that have no region attached we can verify there is a region. The patch performs the following changes: - Move the SingleBlockImplicitTerminator trait further up the structured op base class. - Adapt the LinalgOp verification since the trait only checks if there is 0 or 1 block. - Introduce a getBlock method on the LinalgOp interface. - Access the LinalgOp body using either getBlock() or getBody() if the concrete operation type is known. This patch is a follow up to https://reviews.llvm.org/D111233. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D111393	2021-10-14 09:08:39 +00:00
Aart Bik	a652e5b53a	[mlir][sparse] emergency fix after constant -> arith.constant change Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D111743	2021-10-13 10:26:17 -07:00
Aart Bik	35517a251d	[mlir][sparse] add init sparse tensor operation This is the first step towards supporting general sparse tensors as output of operations. The init sparse tensor is used to materialize an empty sparse tensor of given shape and sparsity into a subsequent computation (similar to the dense tensor init operation counterpart). Example: %c = sparse_tensor.init %d1, %d2 : tensor<?x?xf32, #SparseMatrix> %0 = linalg.matmul ins(%a, %b: tensor<?x?xf32>, tensor<?x?xf32>) outs(%c: tensor<?x?xf32, #SparseMatrix>) -> tensor<?x?xf32, #SparseMatrix> Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D111684	2021-10-13 09:47:56 -07:00
xndcn	8c1553f0d7	[mlir][spirv] Add memory semantics verify for atomic operations Differential Revision: https://reviews.llvm.org/D111510	2021-10-14 00:00:55 +08:00
thomasraoux	cc83c2444f	[mlir][vector] Add canonicalization extract + splat Make canonicalization working on broadcast also work on splat op. Differential Revision: https://reviews.llvm.org/D111690	2021-10-13 08:08:46 -07:00
Mogball	a54f4eae0e	[MLIR] Replace std ops with arith dialect ops Precursor: https://reviews.llvm.org/D110200 Removed redundant ops from the standard dialect that were moved to the `arith` or `math` dialects. Renamed all instances of operations in the codebase and in tests. Reviewed By: rriddle, jpienaar Differential Revision: https://reviews.llvm.org/D110797	2021-10-13 03:07:03 +00:00
Weiwei Li	c0a6381e49	[mlir][SPIRVToLLVM] Solve ExecutionModeOp redefinition and add OpTypeSampledImage into SPV_Type 1. To avoid two ExecutionModeOp using the same name, add the value of the execution mode to the name when converting to LLVM dialect. 2. To avoid syntax error in spv.OpLoad, add OpTypeSampledImage into SPV_Type. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D111193	2021-10-13 10:03:25 +08:00
thomasraoux	aa71f487f3	[mlir] update new linalg vectorization tests after vectorization fix	2021-10-12 16:10:30 -07:00
thomasraoux	7c97e328b3	[mlir][linalg] Fix generic reduction vectorization We shouldn't broadcast the original value when doing reduction. Instead we compute the reduction and then combine it with the original value. Differential Revision: https://reviews.llvm.org/D111666	2021-10-12 15:46:04 -07:00
Diego Caballero	eeb09fd646	[mlir][Linalg] Enable vectorization of 'mul', 'and', 'or' and 'xor' reductions This patch adds support for vectorizing 'mul', 'and', 'or' and 'xor' reductions to Linalg. Reviewed By: pifon2a, ThomasRaoux, aartbik Differential Revision: https://reviews.llvm.org/D111565	2021-10-12 21:08:23 +00:00
Diego Caballero	5c1d356c18	[mlir][Linalg] Enable vectorization of explicit broadcasts This patch teaches `isProjectedPermutation` and `inverseAndBroadcastProjectedPermutation` utilities to deal with maps representing an explicit broadcast, e.g., (d0, d1) -> (d0, 0). This extension is needed to enable vectorization of such explicit broadcast in Linalg. Reviewed By: pifon2a, nicolasvasilache Differential Revision: https://reviews.llvm.org/D111563	2021-10-12 21:08:22 +00:00
Rob Suderman	95e4b71519	[mlir][tosa] Fix tosa average_pool2d to linalg type issue Average pool assumed the same input/output type. Result type for integers is always an i32, should be updated appropriately. Reviewed By: GMNGeoffrey Differential Revision: https://reviews.llvm.org/D111590	2021-10-12 13:09:21 -07:00
Benjamin Kramer	f67d57c95f	[mlir][Shape] Add a pattern to turn extract from shape_of into tensor.dim If I remember correctly this wasn't done previously because dim used to be in the memref dialect. Differential Revision: https://reviews.llvm.org/D111651	2021-10-12 19:09:21 +02:00
Lei Zhang	519b350de0	[mlir][vector] Add folder for no-op InsertStridedSliceOp Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D111636	2021-10-12 11:41:35 -04:00
Nicolas Vasilache	b24c91fffc	[mlir][Vector][Bigfix] Fix vector transfer to store lowering to insert a proper ExtractOp Differential Revision: https://reviews.llvm.org/D111641	2021-10-12 13:28:12 +00:00
Nicolas Vasilache	753a67b5c9	[mlir][Linalg] Refactor and improve vectorization to add support for reduction into 0-d tensors. This revision takes advantage of the recently added support for 0-d transfers and vector.multi_reduction that return a scalar. Reviewed By: pifon2a Differential Revision: https://reviews.llvm.org/D111626	2021-10-12 12:47:36 +00:00
Lei Zhang	bdd37c9f49	[mlir][tensor] Add some folders for insert/extract slice ops * Fold extract_slice immediately after insert_slice. * Fold overlapping insert_slice. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D111439	2021-10-12 08:40:54 -04:00
Nicolas Vasilache	0c74b12a2e	[mlir][Vector] NFC - Add test to exercise lowering of vector.transfer to scf This revision also renames and moves some tests around. Differential Revision: https://reviews.llvm.org/D111606	2021-10-12 12:38:33 +00:00
Nicolas Vasilache	47f7938a94	[mlir][Vector] Add support for lowering 0-d transfers to load/store. Reviewed By: pifon2a Differential Revision: https://reviews.llvm.org/D111603	2021-10-12 12:35:19 +00:00
Nicolas Vasilache	67b10532c6	[mlir][Vector] Allow a 0-d form for vector transfer ops. This revision updates the op semantics, printer, parser and verifier to allow 0-d transfers. Until 0-d vectors are available, such transfers have a special form that transits through vector<1xt>. This is a stepping stone towards the longer term work of adding 0-d vectors and will help significantly reduce corner cases in vectorization. Transformations and lowerings do not yet support this form; extensions will follow. Differential Revision: https://reviews.llvm.org/D111559	2021-10-12 11:48:42 +00:00
Nicolas Vasilache	8f1650cb65	[mlir][Linalg] NFC - Refactor vector.broadcast op verification logic and make it available as a precondition in Linalg vectorization. Reviewed By: pifon2a Differential Revision: https://reviews.llvm.org/D111558	2021-10-12 11:35:34 +00:00
Nicolas Vasilache	31270eb165	[mlir][Vector] Let vector.multi_reduction reduce down to a scalar. vector.multi_reduction currently does not allow reducing down to a scalar. This creates corner cases that are hard to handle during vectorization. This revision extends the semantics and adds the proper transforms, lowerings and canonicalizations to allow lowering out of vector.multi_reduction to other abstractions all the way to LLVM. In the future, when 0-d vectors are also allowed, scalars will still be relevant: 0-d vectors and scalars are not equivalent on all hardware. In the process, splice out the implementation patterns related to vector.multi_reduction into a new file. Reviewed By: pifon2a Differential Revision: https://reviews.llvm.org/D111442	2021-10-12 11:03:54 +00:00
Shraiysh Vaishay	7a79c6afea	[mlir][OpenMP] OpenMP Synchronization Hints stored as IntegerAttr `hint-expression` is an IntegerAttr, because it can be a combination of multiple values from the enum `omp_sync_hint_t` (Section 2.17.12 of OpenMP 5.0) Reviewed By: ftynse, kiranchandramohan Differential Revision: https://reviews.llvm.org/D111360	2021-10-12 11:01:19 +00:00
Aart Bik	849f016ce8	[mlir][sparse] accept affine subscripts in outer dimensions of dense memrefs This relaxes vectorization of dense memrefs a bit so that affine expressions are allowed in more outer dimensions. Vectorization of non unit stride references is disabled though, since this seems ineffective anyway. Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D111469	2021-10-11 11:45:14 -07:00
Uday Bondhugula	b2217b36fe	[MLIR] Fix affine loop unroll corner case for full unroll Fix affine loop unroll for zero trip count loops. Add missing check. Differential Revision: https://reviews.llvm.org/D111375	2021-10-11 10:22:24 +05:30
Amy Zhuang	5ce368cfe2	[mlir] Vectorize induction variables 1. Add support to vectorize induction variables of loops that are not mapped to any vector dimension in SuperVectorize pass. 2. Fix a bug in getForInductionVarOwner. Reviewed By: dcaballe Differential Revision: https://reviews.llvm.org/D111370	2021-10-09 12:40:24 -07:00
Matthias Springer	f8453ea75f	[mlir][linalg][bufferize] Rewrite "write into non-writable memory" detection The purpose of this revision is to make "write into non-writable memory" conflict detection easier to understand. The main idea is that there is a conflict in the case of inplace bufferization if: 1. Someone writes to (an alias of) opOperand, opResult or the to-be-bufferized op writes itself. 2. And, opOperand or opResult aliases a non-writable buffer. Differential Revision: https://reviews.llvm.org/D111379	2021-10-08 21:27:49 +09:00
Lei Zhang	4cd7ff6728	[mlir][linalg] Constant fold linalg.generic that are transposes This commit adds a pattern to perform constant folding on linalg generic ops which are essentially transposes. We see real cases where model importers may generate such patterns. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D110597	2021-10-08 08:09:13 -04:00
Eugene Zhulenev	e2a37bb540	[mlir] Add alignment option to constant tensor bufferization pass Reviewed By: bkramer Differential Revision: https://reviews.llvm.org/D111364	2021-10-08 03:17:20 -07:00
Tobias Gysi	8ed2e8e04f	[mlir][linalg] Retire Linalg ConvOp. The convolution op is one of the remaining hard coded Linalg operations that have no region attached. It became obsolete due to the OpDSL convolution operations. Removing it allows us to delete specialized code and tests that are not needed for the OpDSL counterparts that rely on the standard code paths. Tests that were needed only because of the specialized implementations are removed. Tiling and fusion tests are replaced by variants using linalg.conv_2d. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D111233	2021-10-08 06:56:37 +00:00
Tobias Gysi	23800b05be	[mlir][linalg] Add loop interchange to CodegenStrategy. Add a loop interchange pass and integrate it with CodegenStrategy. This patch depends on https://reviews.llvm.org/D110728 and https://reviews.llvm.org/D110746. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D110748	2021-10-08 06:39:22 +00:00
Tobias Gysi	1ebd197bc5	[mlir][linalg] Add generalization to CodegenStrategy. Add a generalization pass and integrate it with CodegenStrategy. This patch depends on https://reviews.llvm.org/D110728. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D110746	2021-10-08 06:31:19 +00:00
MaheshRavishankar	4281946390	[mlir][Tensor] Add ReifyRankedShapedTypeOpInterface to tensor.extract_slice. Differential Revision: https://reviews.llvm.org/D111263	2021-10-07 17:10:35 -07:00
Amy Zhuang	5d001f58f2	[mlir] Fix a bug in Affine LICM. Currently Affine LICM checks iterOperands and does not hoist out any instruction containing iterOperands. We should check iterArgs instead. Reviewed By: bondhugula Differential Revision: https://reviews.llvm.org/D111090	2021-10-07 15:46:43 -07:00
Matthias Springer	6b1f653c94	[mlir][linalg][bufferize] tensor.cast may require a copy Differential Revision: https://reviews.llvm.org/D110806	2021-10-07 22:24:05 +09:00
Eugene Zhulenev	8276ac13e9	[mlir] Add alignment attribute to memref.global Revived https://reviews.llvm.org/D102435 Add alignment attribute to `memref.global` and propagate it to llvm global in memref->llvm lowering Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D111309	2021-10-07 06:21:57 -07:00
Adrian Kuegel	2bb208ddfd	[mlir] Don't allow dynamic extent tensor types for ConstShapeOp. ConstShapeOp has a constant shape, so its type can always be static. We still allow it to have ShapeType though. Differential Revision: https://reviews.llvm.org/D111139	2021-10-07 10:56:16 +02:00
Tobias Gysi	3fe7fe4424	[mlir][linalg] Add unsigned min/max/cast function to OpDSL. Update OpDSL to support unsigned integers by adding unsigned min/max/cast signatures. Add tests in OpDSL and on the C++ side to verify the proper signed and unsigned operations are emitted. The patch addresses an issue brought up in https://reviews.llvm.org/D111170. Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D111230	2021-10-07 06:27:20 +00:00
Alexandre Rames	fd9613324d	[MLIR] Rename Shape dialect's `join` to `meet`. For the type lattice, we (now) use the "less specialized or equal" partial order, leading to the bottom representing the empty set, and the top representing any type. This naming is more in line with the generally used conventions, where the top of the lattice is the full set, and the bottom of the lattice is the empty set. A typical example is the powerset of a finite set: generally, meet would be the intersection, and join would be the union. ``` top: {a,b,c} / | \ {a,b} {a,c} {b,c} | X X | {a} {b} {c} \ | / bottom: { } ``` This is in line with the examined lattice representations in LLVM: * lattice for `BitTracker::BitValue` in `Hexagon/BitTracker.h` * lattice for constant propagation in `HexagonConstPropagation.cpp` * lattice in `VarLocBasedImpl.cpp` * lattice for address space inference code in `InferAddressSpaces.cpp` Reviewed By: silvas, jpienaar Differential Revision: https://reviews.llvm.org/D110766	2021-10-06 09:41:33 -07:00
Tobias Gysi	a744c7e962	[mlir][linalg] Update OpDSL to use the newly introduced min and max ops. Implement min and max using the newly introduced std operations instead of relying on compare and select. Reviewed By: dcaballe Differential Revision: https://reviews.llvm.org/D111170	2021-10-06 06:45:53 +00:00
Diego Caballero	eaf2588a51	[mlir][Linalg] Add support for min/max reduction vectorization in linalg.generic This patch extends Linalg core vectorization with support for min/max reductions in linalg.generic ops. It enables the reduction detection for min/max combiner ops. It also renames MIN/MAX combining kinds to MINS/MAXS to make the sign explicit for floating point and signed integer types. MINU/MAXU should be introduced in the future for unsigned integer types. Reviewed By: pifon2a, ThomasRaoux Differential Revision: https://reviews.llvm.org/D110854	2021-10-05 22:47:20 +00:00
Lei Zhang	7a89444cd9	[mlir][spirv] Add ops and patterns for lowering standard max/min ops Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D111143	2021-10-05 14:27:32 -04:00
Aart Bik	16b8f4ddae	[mlir][sparse] add a "release" operation to sparse tensor dialect We have several ways to materialize sparse tensors (new and convert) but no explicit operation to release the underlying sparse storage scheme at runtime (other than making an explicit delSparseTensor() library call). To simplify memory management, a sparse_tensor.release operation has been introduced that lowers to the runtime library call while keeping tensors, opaque pointers, and memrefs transparent in the initial IR. Note: there is obviously some tension between the concept of immutable tensors and memory management methods. This tension is addressed by simply stating that after the "release" call, no further memref related operations are allowed on the tensor value. We expect the design to evolve over time, however, and arrive at a more satisfactory view of tensors and buffers eventually. Bug: http://llvm.org/pr52046 Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D111099	2021-10-05 09:35:59 -07:00
Nicolas Vasilache	af9dce18bf	[mlir][Linalg] Allow operand-less scf::ExecuteRegionOp to encapsulate scf::YieldOp These are considered noops. Bufferization will still fail on an scf.execute_region that yields values. This is used to make comprehensive bufferization interoperate better with external clients. Differential Revision: https://reviews.llvm.org/D111130	2021-10-05 11:34:53 +00:00
Alex Zinenko	01d696e563	[mlir] rename the "packing" flag of linalg.pad_tensor to "nofold" The discussion in https://reviews.llvm.org/D110425 demonstrated that "packing" may be a confusing term to define the behavior of this op in presence of the attribute. Instead, indicate the intended effect of preventing the folder from being applied. Reviewed By: nicolasvasilache, silvas Differential Revision: https://reviews.llvm.org/D111046	2021-10-04 21:28:11 +02:00
Nicolas Vasilache	fab634b4e2	[mlir] Tighten strided layout specification. Clarify that the strided layout specification is represented by a single semi-affine map. Differential Revision: https://reviews.llvm.org/D110921	2021-10-04 10:37:05 +00:00
Tobias Gysi	32a7d60516	[mlir][linalg] Change tensor size in unit test (NFC). As a follow up to https://reviews.llvm.org/D110849, adapt the input tensor size to match the iteration space. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D110906	2021-10-04 06:43:35 +00:00
Tobias Gysi	bf28849745	[mlir][linalg] Retire PoolingMaxOp/PoolingMinOp/PoolingSumOp. The pooling ops are among the last remaining hard coded Linalg operations that have no region attached. They became obsolete due to the OpDSL pooling operations. Removing them allows us to delete specialized code and tests that are not needed for the OpDSL counterparts that rely on the standard code paths. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D110909	2021-10-01 13:51:56 +00:00
Uday Bondhugula	08b63db8bb	[MLIR][GPU] Add GPU launch op support for dynamic shared memory Add support for dynamic shared memory for GPU launch ops: add an optional operand to gpu.launch and gpu.launch_func ops to specify the amount of "dynamic" shared memory to use. Update lowerings to connect this operand to the GPU runtime. Differential Revision: https://reviews.llvm.org/D110800	2021-10-01 16:46:07 +05:30
Lei Zhang	cb2e651800	[mlir][linalg] Fix incorrect bound calculation for tiling conv For convolution, the input window dimension's access affine map is of the form `(d0 * s0 + d1)`, where `d0`/`d1` are the output/filter window dimensions and `s0` is the stride. When tiling, https://reviews.llvm.org/D109267 changed the way dimensions are acquired. Instead of directly querying using `.dim` ops on the original convolution op, we now get them by applying the access affine map to the loop upper bounds. This is fine for dimensions having single-dimension affine maps, like matmul, but not for convolution input. It will cause incorrect computation and out-of-bounds accesses. As a concrete example, say we have a 1x225x225x3 (NHWC) input, a 3x3x3x32 (HWCF) filter, and a 1x112x112x3 (NHWC) output with stride 2: (112 * 2 + 3) would be 227, which is different from the correct input window dimension size 225. Instead, we should first calculate the max indices for each loop, apply the affine map to them, and then add one to get the dimension size. Note this makes no difference for matmul-like ops given they will have `d0 - 1 + 1` effectively. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D110849	2021-09-30 13:50:57 -04:00
Stella Laurenzo	267bb194f3	[mlir] Remove old "tc" linalg ods generator. * This could have been removed some time ago as it only had one op left in it, which is redundant with the new approach. * `matmul_i8_i8_i32` (the remaining op) can be trivially replaced by `matmul`, which natively supports mixed precision. Differential Revision: https://reviews.llvm.org/D110792	2021-09-30 16:30:06 +00:00
Matthias Springer	27451a05ed	[mlir][vector] Fold transfer ops and tensor.extract/insert_slice. * Fold vector.transfer_read and tensor.extract_slice. * Fold vector.transfer_write and tensor.insert_slice. Differential Revision: https://reviews.llvm.org/D110627	2021-09-30 09:28:00 +09:00
Nicolas Vasilache	92ea624a13	[mlir][Linalg] Rewrite CodegenStrategy to populate a pass pipeline. This revision retires a good portion of the complexity of the codegen strategy and puts the logic behind pass logic. Differential revision: https://reviews.llvm.org/D110678	2021-09-29 13:35:45 +00:00
bakhtiyar	bdde959533	Remove unnecessary async group creates and awaits. Reviewed By: ezhulenev Differential Revision: https://reviews.llvm.org/D110605	2021-09-28 14:52:08 -07:00
Amy Zhuang	7ab14b8886	[mlir] Unroll-and-jam loops with iter_args. Unroll-and-jam currently doesn't work when the loop being unroll-and-jammed or any of its inner loops has iter_args. This patch modifies the unroll-and-jam utility to support loops with iter_args. Reviewed By: bondhugula Differential Revision: https://reviews.llvm.org/D110085	2021-09-28 14:13:27 -07:00
thomasraoux	b12e4c17e0	[mlir] Fix bug in FoldSubview with rank reducing subview Fix how we calculate the new permutation map of the transfer ops. Differential Revision: https://reviews.llvm.org/D110638	2021-09-28 13:18:29 -07:00
Alexander Belyaev	9fb57c8c1d	[mlir] Add min/max operations to Standard. [RFC: Add min/max ops](https://llvm.discourse.group/t/rfc-add-min-max-operations/4353) I was following the naming style for Arith dialect in https://reviews.llvm.org/D110200, i.e. similar to DivSIOp and DivUIOp I defined MaxSIOp, MaxUIOp. When Arith PR is landed, I will migrate these ops as well. Differential Revision: https://reviews.llvm.org/D110540	2021-09-28 09:40:22 +02:00
Tobias Gysi	d20d0e145d	[mlir][linalg] Finer-grained padding control. Adapt the signature of the PaddingValueComputationFunction callback to either return the padding value or failure to signal padding is not desired. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D110572	2021-09-27 19:21:37 +00:00
Aart Bik	ec97a205c3	[mlir][sparse] preserve zero-initialization for materializing buffers This revision makes sure that when the output buffer materializes locally (in contrast with the passing in of output tensors either in-place or not in-place), the zero initialization assumption is preserved. This also adds a bit more documentation on our sparse kernel assumption (viz. TACO assumptions). Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D110442	2021-09-27 11:22:05 -07:00
Bixia Zheng	fbd5821c6f	Implement the conversion from sparse constant to sparse tensors. The sparse constant provides a constant tensor in coordinate format. We first split the sparse constant into a constant tensor for indices and a constant tensor for values. We then generate a loop to fill a sparse tensor in coordinate format using the tensors for the indices and the values. Finally, we convert the sparse tensor in coordinate format to the destination sparse tensor format. Add tests. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D110373	2021-09-27 09:47:29 -07:00
Eugene Zhulenev	92db09cde0	[mlir] AsyncRuntime: use int64_t for ref counting operations Workaround for SystemZ ABI problem: https://bugs.llvm.org/show_bug.cgi?id=51898 Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D110550	2021-09-27 07:55:01 -07:00
Matthias Springer	ffdf0a370d	[mlir][vector] Fix bug in vector-transfer-full-partial-split When splitting with linalg.copy, cannot write into the destination alloc directly. Instead, write into a subview of the alloc. Differential Revision: https://reviews.llvm.org/D110512	2021-09-27 18:12:17 +09:00
Lei Zhang	b45476c94c	[mlir][tosa] Do not fold transpose with quantized types For such cases, the type of the constant DenseElementsAttr is different from the transpose op return type. Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D110446	2021-09-24 16:57:55 -04:00
River Riddle	aca9bea199	[mlir:MemRef] Move DmaStartOp/DmaWaitOp to ODS These are among the last operations still defined explicitly in C++. I've tried to keep this commit as NFC as possible, but these ops definitely need a non-NFC cleanup at some point. Differential Revision: https://reviews.llvm.org/D110440	2021-09-24 19:35:28 +00:00
Lei Zhang	e325ebb9c7	[mlir][tosa] Add some transpose folders * If the input is a constant splat value, we just need to reshape it. * If the input is a general constant with one user, we can also constant fold it, without bloating the IR. Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D110439	2021-09-24 15:25:14 -04:00
Alex Zinenko	5988a3b7a0	[mlir] Linalg: ensure tile-and-pad always creates padding as requested Initially, the padding transformation and the related operation were only used to guarantee static shapes of subtensors in tiled operations. The transformation would not insert the padding operation if the shapes were already static, and the overall code generation would actively remove such "noop" pads. However, this transformation can also be used to pack data into smaller tensors and marshall them into faster memory, regardless of the size mismatches. In the context of expert-driven transformation, we should assume that, if padding is requested, a potentially padded tensor must always be created. Update the transformation accordingly. To do this, introduce an optional `packing` attribute to the `pad_tensor` op that serves as an indication that the padding is an intentional choice (as opposed to a side effect of type normalization) and should be left alone by cleanups. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D110425	2021-09-24 18:40:13 +02:00
Alex Zinenko	3f89e339bb	[mlir] add pad_tensor(tensor.cast) -> pad_tensor canonicalizer This canonicalization pattern complements the tensor.cast(pad_tensor) one in propagating constant type information when possible. It contributes to the feasibility of pad hoisting. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D110343	2021-09-24 12:03:47 +02:00
Matthias Springer	f3f25ffc04	[mlir][linalg] Fix result type in FoldSourceTensorCast * Do not discard static result type information that cannot be inferred from lower/upper padding. * Add optional argument to `PadTensorOp::inferResultType` for specifying known result dimensions. Differential Revision: https://reviews.llvm.org/D110380	2021-09-24 16:47:18 +09:00
Matthias Springer	2190f8a8b1	[mlir][linalg] Support tile+peel with TiledLoopOp Only scf.for was supported until now. Differential Revision: https://reviews.llvm.org/D110220	2021-09-24 10:23:31 +09:00
Matthias Springer	8dc16ba8d2	[mlir][linalg] Merge all tiling passes into a single one. Passes such as `linalg-tile-to-tiled-loop` are merged into `linalg-tile`. Differential Revision: https://reviews.llvm.org/D110214	2021-09-24 10:16:46 +09:00
Aart Bik	a924fcc7c3	[mlir][sparse] add sparse kernels test to sparse compiler test suite This test makes sure kernels map to efficient sparse code, i.e. all compressed for-loops, no co-iterating while loops. In addition, this revision removes the special constant folding inside the sparse compiler in favor of Mahesh' new generic linalg folding. Thanks! NOTE: relies on Mahesh fix, which needs to be rebased first Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D110001	2021-09-22 14:56:39 -07:00
MaheshRavishankar	a40a08ed98	[mlir][Linalg] Teach constant -> generic op fusion to handle scalar constants. The current folder of constant -> generic op only handles splat constants. The same logic holds for scalar constants. Teach the pattern to handle such cases. Differential Revision: https://reviews.llvm.org/D109982	2021-09-22 13:41:47 -07:00
Aart Bik	5da21338bc	[mlir][sparse] generalize reduction support in sparse compiler Now not just SUM, but also PRODUCT, AND, OR, XOR. The reductions MIN and MAX are still to be done (also depends on recognizing these operations in cmp-select constructs). Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D110203	2021-09-22 12:36:46 -07:00
Tobias Gysi	8b5236def5	[mlir][linalg] Simplify slice dim computation for fusion on tensors (NFC). Compute the tiled producer slice dimensions directly starting from the consumer not using the producer at all. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D110147	2021-09-21 15:09:46 +00:00
Nicolas Vasilache	101d017a64	[mlir][Linalg] Revisit heuristic ordering of tensor.insert_slice in comprehensive bufferize. It was previously assumed that tensor.insert_slice should be bufferized first in a greedy fashion to avoid out-of-place bufferization of the large tensor. This heuristic does not hold upon further inspection. This CL removes the special handling of such ops and adds a test that exhibits better behavior and appears in real use cases. The only test adversely affected is an artificial test which results in a returned memref: this pattern is not allowed by comprehensive bufferization in real scenarios anyway and the offending test is deleted. Differential Revision: https://reviews.llvm.org/D110072	2021-09-21 14:22:45 +00:00
Nicolas Vasilache	0d2c54e851	[mlir][Linalg] Revisit RAW dependence interference in comprehensive bufferize. Previously, comprehensive bufferize would consider all aliasing reads and writes to the result buffer and matching operand. This resulted in spurious dependences being considered and resulted in too many unnecessary copies. Instead, this revision revisits the gathering of read and write alias sets. This results in fewer allocs and copies. An exhaustive test case is added that considers all possible permutations of `matmul(extract_slice(fill), extract_slice(fill), ...)`.	2021-09-21 14:22:22 +00:00
Morten Borup Petersen	032cb1650f	[MLIR][SCF] Add for-to-while loop transformation pass This pass transforms SCF.ForOp operations to SCF.WhileOp. The For loop condition is placed in the 'before' region of the while operation, and induction variable incrementation + the loop body in the 'after' region. The loop carried values of the while op are the induction variable (IV) of the for-loop + any iter_args specified for the for-loop. Any 'yield' ops in the for-loop are rewritten to additionally yield the (incremented) induction variable. This transformation is useful for passes where we want to consider structured control flow solely on the basis of a loop body and the computation of a loop condition. As an example, when doing high-level synthesis in CIRCT, the incrementation of an IV in a for-loop is "just another part" of a circuit datapath, and what we really care about is the distinction between our datapath and our control logic (the condition variable). Differential Revision: https://reviews.llvm.org/D108454	2021-09-21 09:09:54 +01:00
River Riddle	4f21152af1	[mlir] Tighten verification of SparseElementsAttr SparseElementsAttr currently does not perform any verification on construction, with the only verification existing within the parser. This revision moves the parser verification to SparseElementsAttr, and also adds additional verification for when a sparse index is not valid. Differential Revision: https://reviews.llvm.org/D109189	2021-09-21 01:57:42 +00:00
MaheshRavishankar	4cf9bf6c9f	[mlir][MemRef] Compute unused dimensions of rank-reducing subviews using strides as well. For `memref.subview` operations, when there is more than one unit dimension, the strides need to be used to figure out which of the unit-dims are actually dropped. Differential Revision: https://reviews.llvm.org/D109418	2021-09-20 11:05:30 -07:00
MaheshRavishankar	0b33890f45	[mlir][Linalg] Add ConvolutionOpInterface. Add an interface that allows grouping together all convolution and pooling ops within Linalg named ops. The interface currently verifies that - the indexing map used for input/image access is valid - the filter and output are accessed using projected permutations - all loops are characterizable as one iterating over - batch dimension, - output image dimensions, - filter convolved dimensions, - output channel dimensions, - input channel dimensions, - depth multiplier (for depthwise convolutions) Differential Revision: https://reviews.llvm.org/D109793	2021-09-20 10:41:10 -07:00
Mehdi Amini	5edd79fc97	Revert "[MLIR][SCF] Add for-to-while loop transformation pass" This reverts commit `644b55d57e`. The added test is failing the bots.	2021-09-20 17:21:59 +00:00
Tobias Gysi	7be28d82b4	[mlir][linalg] Add IndexOp support to fusion on tensors. This revision depends on https://reviews.llvm.org/D109761 and https://reviews.llvm.org/D109766. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D109774	2021-09-20 15:59:35 +00:00
Morten Borup Petersen	644b55d57e	[MLIR][SCF] Add for-to-while loop transformation pass This pass transforms SCF.ForOp operations to SCF.WhileOp. The For loop condition is placed in the 'before' region of the while operation, and induction variable incrementation + the loop body in the 'after' region. The loop carried values of the while op are the induction variable (IV) of the for-loop + any iter_args specified for the for-loop. Any 'yield' ops in the for-loop are rewritten to additionally yield the (incremented) induction variable. This transformation is useful for passes where we want to consider structured control flow solely on the basis of a loop body and the computation of a loop condition. As an example, when doing high-level synthesis in CIRCT, the incrementation of an IV in a for-loop is "just another part" of a circuit datapath, and what we really care about is the distinction between our datapath and our control logic (the condition variable). Differential Revision: https://reviews.llvm.org/D108454	2021-09-20 16:57:50 +01:00
Tobias Gysi	6db928b8f3	[mlir][linalg] Fusion on tensors. Add a new version of fusion on tensors that supports the following scenarios: - support input and output operand fusion - fuse a producer result passed in via tile loop iteration arguments (update the tile loop iteration arguments) - supports only linalg operations on tensors - supports only scf::for - cannot add an output to the tile loop nest The LinalgTileAndFuseOnTensors pass tiles the root operation and fuses its producers. Reviewed By: nicolasvasilache, mravishankar Differential Revision: https://reviews.llvm.org/D109766	2021-09-20 14:45:34 +00:00
KareemErgawy-TomTom	bdcf4b9b96	[MLIR][Linalg] Make detensoring cost-model more flexible. So far, the CF cost-model for detensoring was limited to discovering pure CF structures. This means, if while discovering the CF component, the cost-model found any op that is not detensorable, it gives up on detensoring altogether. This patch makes it a bit more flexible by cleaning-up the detensorable component from non-detensorable ops without giving up entirely. Reviewed By: silvas Differential Revision: https://reviews.llvm.org/D109965	2021-09-20 10:21:31 +02:00
Uday Bondhugula	57eda9becc	[MLIR][GPU] Add constant propagator for gpu.launch op Add a constant propagator for gpu.launch op in cases where the grid/thread IDs can be trivially determined to take a single constant value of zero. Differential Revision: https://reviews.llvm.org/D109994	2021-09-18 12:02:46 +05:30
Aart Bik	d4e16171e8	[mlir][sparse] add dce test for all sparse tensor ops Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D109992	2021-09-17 13:03:42 -07:00
thomasraoux	08f0cb7719	[mlir] Prevent crash in DropUnitDim pattern due to tensor with encoding Differential Revision: https://reviews.llvm.org/D109984	2021-09-17 12:03:16 -07:00
thomasraoux	36aac53b36	[mlir][linalg] Extend drop unit dim pattern to all cases of reduction Even with all parallel loops reading the output value is still allowed so we don't have to handle reduction loops differently. Differential Revision: https://reviews.llvm.org/D109851	2021-09-17 10:09:57 -07:00
thomasraoux	416679615d	[mlir] Linalg hoisting should ignore uses outside the loop Differential Revision: https://reviews.llvm.org/D109859	2021-09-17 10:06:57 -07:00
Aart Bik	b1d44e5902	[mlir][sparse] add affine subscripts to sparse compilation pass This enables the sparsification of more kernels, such as convolutions where there is a x(i+j) subscript. It also enables more tensor invariants such as x(1) or other affine subscripts such as x(i+1). Currently, we reject sparsity altogether for such tensors. Despite this restriction, however, we can already handle a lot more kernels with compound subscripts for dense access (viz. convolution with dense input and sparse filter). Some unit tests and an integration test demonstrate new capability. Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D109783	2021-09-15 20:28:04 -07:00
Rob Suderman	1ac2d195ec	[mlir][linalg] Add canonicalizers for depthwise conv There are two main versions of depthwise conv depending whether the multiplier is 1 or not. In cases where m == 1 we should use the version without the multiplier channel as it can perform greater optimization. Add lowering for the quantized/float versions to have a multiplier of one. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D108959	2021-09-15 14:09:15 -07:00
Simon Camphausen	1b79efdc72	[mlir] Fix printing of EmitC attrs/types with escape characters Attributes and types were not escaped when printing. Reviewed By: jpienaar, marbre Differential Revision: https://reviews.llvm.org/D109143	2021-09-15 18:15:38 +00:00
Nicolas Vasilache	96ec0ff2b7	[mlir][Linalg] Revisit insertion points in comprehensive bufferization. This revision fixes a corner case that could appear due to incorrect insertion point behavior in comprehensive bufferization. Differential Revision: https://reviews.llvm.org/D109830	2021-09-15 18:11:38 +00:00
Nicolas Vasilache	6fe77b1051	[mlir][Linalg] Fail comprehensive bufferization if a memref is returned. Differential revision: https://reviews.llvm.org/D109824	2021-09-15 15:11:17 +00:00
Tobias Gysi	44a889778c	[mlir][linalg] Fold ExtractSliceOps during tiling. Add the makeComposedExtractSliceOp method that creates an ExtractSliceOp and folds chains of ExtractSliceOps by computing the sum of their offsets and by multiplying their strides. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D109601	2021-09-14 11:43:52 +00:00
Matthias Springer	62883459cd	[mlir][linalg] makeTiledShape: No affine.min if tile size == 1 This improves codegen (more static type information) with `scalarize-dynamic-dims`. Differential Revision: https://reviews.llvm.org/D109415	2021-09-14 10:48:20 +09:00
Matthias Springer	fb1def9c66	[mlir][linalg] New tiling option: Scalarize dynamic dims This tiling option scalarizes all dynamic dimensions, i.e., it tiles all dynamic dimensions by 1. This option is useful for linalg ops with partly dynamic tensor dimensions. E.g., such ops can appear in the partial iteration after loop peeling. After scalarizing dynamic dims, those ops can be vectorized. Differential Revision: https://reviews.llvm.org/D109268	2021-09-14 10:40:50 +09:00
Matthias Springer	8faf35c0a5	[mlir][linalg] Add scf.for loop peeling to codegen strategy Only scf.for loops are supported at the moment. linalg.tiled_loop support will be added in a subsequent commit. Only static tensor sizes are supported. Loops for dynamic tensor sizes can be peeled, but the generated code is not optimal due to a missing canonicalization pattern. Differential Revision: https://reviews.llvm.org/D109043	2021-09-14 10:35:01 +09:00
Matthias Springer	a4a654d301	[mlir][linalg] TiledLoopOp peeling: Do not peel partial iterations Extend the unit test with an option for skipping partial iterations during loop peeling. Differential Revision: https://reviews.llvm.org/D109640	2021-09-14 10:01:46 +09:00
Nicolas Vasilache	181d18ef53	[mlir][Linalg] Insert static buffers as high as possible during ComprehensiveBufferization. This revision allows hoisting static alloc/dealloc pairs as high as possible during ComprehensiveBufferization. This also aligns such allocated buffers to 128B by default. This change exhibited some issues wrt insertion points and a missing copy that are also fixed in this revision; tests are updated accordingly. Differential Revision: https://reviews.llvm.org/D109684	2021-09-13 15:59:03 +00:00
Nicolas Vasilache	b01d223faf	[mlir][Linalg] Use reify for padded op shape derivation. Previously, we would insert a DimOp and rely on later canonicalizations. Unfortunately, reifyShape kind of rewrites are not canonicalizations anymore. This introduces undesirable pass dependencies. Instead, immediately reify the result shape and avoid the DimOp altogether. This is akin to a local folding, which avoids introducing more reliance on `-resolve-shaped-type-result-dims` (similar to compositions of `affine.apply` by construction to avoid chains of size > 1). It does not completely get rid of the reliance on the pass as the process is merely local: calling the pass may still be necessary for global effects. Indeed, one of the tests still requires the pass. Differential Revision: https://reviews.llvm.org/D109571	2021-09-13 11:54:59 +00:00
Rob Suderman	b0532286fe	[mlir][tosa] Add shape inference for tosa.while Tosa.while shape inference requires repeatedly running shape inference across the body of the loop until the types become static as we do not know the number of iterations required by the loop body. Once the least specific arguments are known they are propagated to both regions. To determine the final end type, the least restrictive types are determined from all yields. Differential Revision: https://reviews.llvm.org/D108801	2021-09-10 13:11:53 -07:00
Stephan Herhut	5e6c170b3f	[mlir][linalg] Fix bufferize pattern to allow unknown operations in body of generic The original version of the bufferization pattern for linalg.generic would manually clone operations within the region to the bufferized clone of the operation. This triggers legality requirements on those operations in the conversion infra. Instead, this now uses the rewriter to inline the region instead, avoiding those legality requirements. Differential Revision: https://reviews.llvm.org/D109581	2021-09-10 13:37:42 +02:00
Matthias Springer	0f3544d185	[mlir][scf] Loop peeling: Use scf.for for partial iteration Generate an scf.for instead of an scf.if for the partial iteration. This is for consistency reasons: The peeling of linalg.tiled_loop also uses another loop for the partial iteration. Note: Canonicalization patterns may rewrite partial iterations to scf.if afterwards. Differential Revision: https://reviews.llvm.org/D109568	2021-09-10 19:07:09 +09:00
Nicolas Vasilache	5f1a1af4bf	[mlir][Linalg] Properly order extract_slice traversal in comprehensive bufferization This revision fixes the traversal order of extract_slice during the inplace analysis. It was previously thought that such ops could be analyzed at the very end. This is unfortunately not true as the AliasInfo for dependents of these ops needs to be updated. This change allows the aliases introduced by the bufferization of extract_slice to be properly propagated. Differential Revision: https://reviews.llvm.org/D109519	2021-09-10 07:10:06 +00:00
Aart Bik	066d786ce0	[mlir][sparse] add folding to sparse_tensor.convert folds conversion between identical types (with tests) Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D109545	2021-09-09 15:45:19 -07:00
Matthias Springer	c7d569b8f7	[mlir][scf] Fold dim(scf.for) to dim(iter_arg) Fold dim ops of scf.for results to dim ops of the respective iter args if the loop is shape preserving. Differential Revision: https://reviews.llvm.org/D109430	2021-09-09 13:47:13 +09:00
Matthias Springer	e2c8fcb9d0	[mlir][linalg] Fold dim(linalg.tiled_loop) to dim(output_arg) Fold dim ops of linalg.tiled_loop results to dim ops of the respective iter args if the loop is shape preserving. Differential Revision: https://reviews.llvm.org/D109431	2021-09-09 13:37:28 +09:00
Matthias Springer	f7137da174	[mlir][linalg] Fix dim(iter_arg) canonicalization Run a small analysis to see if the runtime type of the iter_arg is changing. Fold only if the runtime type stays the same. (Same as `DimOfIterArgFolder` in SCF.) Differential Revision: https://reviews.llvm.org/D109299	2021-09-09 12:13:05 +09:00
Matthias Springer	c95a7246a3	[mlir][linalg] Tiling: Use loop ub in extract_slice size computation if possible When tiling a LinalgOp, extract_slice/insert_slice pairs are inserted. To avoid going out-of-bounds when the tile size does not divide the shape size evenly (at the boundary), AffineMin ops are inserted. Some ops have assumptions regarding the dimensions of inputs/outputs. E.g., in a `A * B` matmul, `dim(A, 1) == dim(B, 0)`. However, loop bounds use either `dim(A, 1)` or `dim(B, 0)`. With this change, AffineMin ops are expressed in terms of loop bounds instead of tensor sizes. (Both have the same runtime value.) This simplifies canonicalizations. Differential Revision: https://reviews.llvm.org/D109267	2021-09-09 11:06:22 +09:00
Chris Lattner	42431b8207	[tests] Make testsuite more resilient to "order of constant" changes. NFC.	2021-09-08 10:10:10 -07:00
Matthias Springer	c57c4f888c	[mlir][linalg] linalg.tiled_loop peeling Differential Revision: https://reviews.llvm.org/D108270	2021-09-07 09:50:08 +09:00
Eugene Zhulenev	fd52b4357a	[mlir] Async: check awaited operand error state after sync await Previously only await inside the async function (coroutine after lowering to async runtime) would check the error state Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D109229	2021-09-04 05:00:17 -07:00
Loren Maggiore	361458b1ce	[mlir] create gpu memset op Create a gpu memset op and corresponding CUDA and ROCm wrappers. Reviewed By: herhut, lorenrose1013 Differential Revision: https://reviews.llvm.org/D107548	2021-09-04 08:13:04 +02:00
Mehdi Amini	78accf9f35	Make LLVM Linkage a first class attribute instead of using an integer attribute This makes the IR more readable, in particular when this will be used on the builtin func outside of the LLVM dialect. Reviewed By: wsmoses Differential Revision: https://reviews.llvm.org/D109209	2021-09-03 21:21:46 +00:00
Alexander Belyaev	5ee5bbd0ff	[mlir][linalg] Extend tiled_loop to SCF conversion to generate scf.parallel. Differential Revision: https://reviews.llvm.org/D109230	2021-09-03 18:05:54 +02:00
Aart Bik	b6d1a31c1b	[mlir][sparse] refine heuristic for iteration graph topsort The sparse index order must always be satisfied, but this may give a choice in topsorts for several cases. We broke ties in favor of any dense index order, since this gives good locality. However, breaking ties in favor of pushing unrelated indices into sparse iteration spaces gives better asymptotic complexity. This revision improves the heuristic. Note that in the long run, we are really interested in using ML for ML to find the best loop ordering as a replacement for such heuristics. Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D109100	2021-09-03 08:37:15 -07:00
Matthias Springer	4fa6c2734c	[mlir][scf] Allow runtime type of iter_args to change The limitation on iter_args introduced with D108806 is too restricting. Changes of the runtime type should be allowed. Extends the dim op canonicalization with a simple analysis to determine when it is safe to canonicalize. Differential Revision: https://reviews.llvm.org/D109125	2021-09-03 10:03:05 +09:00
Kiran Chandramohan	711aa35759	[MLIR][OpenMP] Add support for declaring critical construct names Add an operation omp.critical.declare to declare names/symbols of critical sections. Named omp.critical operations should use symbols declared by omp.critical.declare. Having a declare operation ensures that the names of critical sections are global and unique. In the lowering flow to LLVM IR, the OpenMP IRBuilder creates unique names for critical sections. Reviewed By: ftynse, jeanPerier Differential Revision: https://reviews.llvm.org/D108713	2021-09-02 14:31:19 +00:00
Weiwei Li	a79d7c2c85	[mlir][SPIRV] Add Image Operands for Image Instructions This patch adds Image Operands to the SPIR-V dialect and also lets ImageDrefGather use Image Operands. Image Operands are used in many image instructions. "Image Operands encodes what operands follow, as per Image Operands". Usually, they are optional for image instructions. The format of image operands looks like: %0 = spv.ImageXXXX %1, ... %3 : f32 ["Bias|Lod"](%4, %5 : f32, f32) -> ... This patch doesn’t implement all operands (see Section 3.14 in SPIR-V Spec) but provides a skeleton of it. There is a TODO in the verifyImageOperands function. Co-authored: Alan Liu <alanliu.yf@gmail.com> Reviewed by: antiagainst Differential Revision: https://reviews.llvm.org/D108501	2021-09-02 04:14:17 +08:00
MaheshRavishankar	b686fdbf92	[mlir][Linalg] Drop output tensor from `linalg.pad_tensor` op. The output tensor was added for tiling purposes. With use of `TilingInterface` for tiling pad operations, there is no need for an explicit operand for the shape of result of `linalg.pad_tensor` op. The interface allows the tiling pattern to query the value that can be used for the "init" needed for tiling dynamically. Differential Revision: https://reviews.llvm.org/D108613	2021-08-31 11:12:24 -07:00
Mehdi Amini	387f95541b	Add a new interface allowing to set a default dialect to be used for printing/parsing regions Currently the builtin dialect is the default namespace used for parsing and printing. As such module and func don't need to be prefixed. In the case of some dialects that defines new regions for their own purpose (like SpirV modules for example), it can be beneficial to change the default dialect in order to improve readability. Differential Revision: https://reviews.llvm.org/D107236	2021-08-31 17:52:40 +00:00
Mehdi Amini	c41b16c26b	Change ASM Op printer to print the operation name in the framework instead of leaving it up to each individual operation This aligns the printer with the parser contract: the operation isn't part of the user-controllable part of the syntax. Differential Revision: https://reviews.llvm.org/D108804	2021-08-31 17:52:40 +00:00
Tres Popp	44485fcd97	[mlir] Prevent assertion failure in DropUnitDims Don't assert fail on strided memrefs when dropping unit dims. Instead just leave them unchanged. Differential Revision: https://reviews.llvm.org/D108205	2021-08-31 12:15:13 +02:00
marina kolpakova a.k.a. geexie	0080d2aa55	[mlir][gpu] folds memref.dim of gpu.alloc implements canonicalization which folds memref.dim(gpu.alloc(%size), %idx) -> %size Differential Revision: https://reviews.llvm.org/D108892	2021-08-31 12:33:10 +03:00
MaheshRavishankar	ba72cfe734	[mlir] Add an interface to allow operations to specify how they can be tiled. An interface to allow for tiling of operations is introduced. The tiling of the linalg.pad_tensor operation is modified to use this interface. Differential Revision: https://reviews.llvm.org/D108611	2021-08-30 16:31:18 -07:00
Matthias Springer	d18ffd61d4	[mlir][SCF] Canonicalize dim(x) where x is an iter_arg * Add `DimOfIterArgFolder`. * Move existing cross-dialect canonicalization patterns to `LoopCanonicalization.cpp`. * Rename `SCFAffineOpCanonicalization` pass to `SCFForLoopCanonicalization`. * Expand documentation of scf.for: The type of loop-carried variables may not change with iterations. (Not even the dynamic type.) Differential Revision: https://reviews.llvm.org/D108806	2021-08-30 01:39:56 +00:00
Aart Bik	0a7b8cc5dd	[mlir][sparse] fully implement sparse tensor to sparse tensor conversions with rigorous integration test Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D108721	2021-08-27 15:08:18 -07:00
Matthias Springer	a9cff97f94	[mlir][SCF] Generalize AffineMinSCFCanonicalization to min/max ops * Add support for affine.max ops to SCF loop peeling pattern. * Add support for affine.max ops to `AffineMinSCFCanonicalizationPattern`. * Rename `AffineMinSCFCanonicalizationPattern` to `AffineOpSCFCanonicalizationPattern`. * Rename `AffineMinSCFCanonicalization` pass to `SCFAffineOpCanonicalization`. Differential Revision: https://reviews.llvm.org/D108009	2021-08-25 10:40:34 +09:00
Matthias Springer	2de2dbef2a	[mlir][linalg] Replace AffineMinSCFCanonicalizationPattern with SCF reimplementation Use the new canonicalization pattern in the SCF dialect. Differential Revision: https://reviews.llvm.org/D107732	2021-08-25 08:52:56 +09:00
Matthias Springer	98aa694d0d	[mlir][scf] Add general affine.min canonicalization pattern This canonicalization simplifies affine.min operations inside "for loop"-like operations (e.g., scf.for and scf.parallel) based on two invariants: * iv >= lb * iv < lb + step * ((ub - lb - 1) floorDiv step) + 1 This commit adds a new pass `canonicalize-scf-affine-min` (instead of being a canonicalization pattern) to avoid dependencies between the Affine dialect and the SCF dialect. Differential Revision: https://reviews.llvm.org/D107731	2021-08-25 07:32:30 +09:00
Tyler Augustine	d25e91d7f6	Support alias.scope and noalias metadata Introduces new Ops to represent 1. alias.scope metadata in LLVM, and 2. domains for these scopes. These correspond to the metadata described in https://llvm.org/docs/LangRef.html#noalias-and-alias-scope-metadata. Lists of scopes are modeled the same way as access groups - as an ArrayAttr on the Op (added in https://reviews.llvm.org/D97944). Lowering 'noalias' attributes on function parameters is already supported. However, lowering `noalias` metadata on individual Ops is not, which is added in this change. LLVM uses the same keyword for these, but this change introduces a separate attribute name 'noalias_scopes' to represent this distinct concept. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D107870	2021-08-24 20:42:59 +02:00
Matthias Springer	ebf35370ff	[mlir][tensor] Insert explicit tensor.cast ops for insert_slice src If additional static type information can be deduced from an insert_slice's size operands, insert an explicit cast of the op's source operand. This enables other canonicalization patterns that are matching for tensor_cast ops such as `ForOpTensorCastFolder` in SCF. Differential Revision: https://reviews.llvm.org/D108617	2021-08-24 19:45:04 +09:00
MaheshRavishankar	b546f4347b	[mlir][Linalg] Allow controlling fusion of linalg.generic -> linalg.tensor_expand_shape. Differential Revision: https://reviews.llvm.org/D108565	2021-08-23 16:28:10 -07:00
Aart Bik	236a90802d	[mlir][sparse] replace support lib conversion with actual MLIR codegen Rationale: Passing in a pointer to the memref data in order to implement the dense to sparse conversion was a bit too low-level. This revision improves upon that approach with a cleaner solution of generating a loop nest in MLIR code itself that prepares the COO object before passing it to our "swiss army knife" setup. This is much more intuitive and now also allows for dynamic shapes. Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D108491	2021-08-23 14:26:05 -07:00
Matthias Springer	bc194a5bb5	[mlir][SCF] Do not peel loops inside partial iterations Do not apply loop peeling to loops that are contained in the partial iteration of an already peeled loop. This is to avoid code explosion when dealing with large loop nests. Can be controlled with a new pass option `skip-partial`. Differential Revision: https://reviews.llvm.org/D108542	2021-08-23 21:35:46 +09:00
Rob Suderman	871c812483	[mlir][linalg] Finish refactor of TC ops to YAML Multiple operations were still defined as TC ops that had equivalent versions as YAML operations. Reducing to a single compilation path guarantees that frontends can lower to their equivalent operations without missing the optimized fastpath. Some operations are maintained purely for testing purposes (mainly conv{1,2,3}D, as they are included as sole tests in the vectorization transforms). Differential Revision: https://reviews.llvm.org/D108169	2021-08-20 12:35:04 -07:00
Aart Bik	758ccf8506	[mlir][sparse] add test for DimOp folding Folding in the MLIR uses the order of the type directly but folding in the underlying implementation must take the dim ordering into account. These tests clarify that behavior and verify it is done right. Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D108474	2021-08-20 11:24:09 -07:00
Morten Borup Petersen	6c1436a9b0	[MLIR][SCF] Parenthesize multiple return types in scf.execute_region asm op Previously, ExecuteRegionOps with multiple return values would fail a round-trip test due to missing parenthesis around the types. Differential Revision: https://reviews.llvm.org/D108402	2021-08-19 21:31:51 +01:00
Matthias Springer	76a1861816	[mlir][SparseTensor] Split scf.for loop into masked/unmasked parts Apply the "for loop peeling" pattern from SCF dialect transforms. This pattern splits scf.for loops into full and partial iterations. In the full iteration, all masked loads/stores are canonicalized to unmasked loads/stores. Differential Revision: https://reviews.llvm.org/D107733	2021-08-19 21:53:11 +09:00
Matthias Springer	8e8b70aa84	[mlir][scf] Simplify affine.min ops after loop peeling Simplify affine.min ops, enabling various other canonicalizations inside the peeled loop body. affine.min ops such as: ``` #map = affine_map<(d0)[s0, s1] -> (s0, -d0 + s1)> %r = affine.min #map(%iv)[%step, %ub] ``` are rewritten into (in the case of the peeled loop): ``` %r = %step ``` To determine how an affine.min op should be rewritten and to prove its correctness, FlatAffineConstraints is utilized. Differential Revision: https://reviews.llvm.org/D107222	2021-08-19 17:24:53 +09:00
Matthias Springer	08dbed8a57	[mlir][linalg] Canonicalize dim ops of tiled_loop block args E.g.: ``` %y = ... : tensor<...> linalg.tiled_loop ... ins(%x = %y : tensor<...>) { tensor.dim %x, %c0 : tensor<...> } ``` is rewritten to: ``` %y = ... : tensor<...> linalg.tiled_loop ... ins(%x = %y : tensor<...>) { tensor.dim %y, %c0 : tensor<...> } ``` Differential Revision: https://reviews.llvm.org/D108272	2021-08-19 11:24:33 +09:00
Aart Bik	d37d72eaf8	[mlir][sparse] use shared util for DimOp generation This shares more code with existing utilities. Also, to be consistent, we moved dimension permutation on the DimOp to the tensor lowering phase. This way, both pre-existing DimOps on sparse tensors (not likely but possible) as well as compiler generated DimOps are handled consistently. Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D108309	2021-08-18 17:12:32 -07:00
Chia-hung Duan	41e5dbe0fa	Enables inferring return types for Shape op if possible Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D102565	2021-08-18 21:36:55 +00:00
Butygin	ddc3d51d58	[mlir][spirv] Add (InBounds)PtrAccessChain ops Differential Revision: https://reviews.llvm.org/D108070	2021-08-18 17:59:21 +03:00
Lei Zhang	4c15ad2321	[mlir][linalg] Don't drop existing attributes when creating ops Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D108219	2021-08-17 15:44:56 -04:00
Robert Suderman	65532ea6dd	[mlir][linalg] Clear unused linalg tc operations These operations are not lowered to from any source dialect and are only used for redundant tests. Removing these named ops, along with their associated tests, will make migration to YAML operations much more convenient. Reviewed By: stellaraccident Differential Revision: https://reviews.llvm.org/D107993	2021-08-16 11:55:45 -07:00
tashuang.zk	2d45e332ba	[MLIR][DISC] Revise ParallelLoopTilingPass with inbound_check mode Expand ParallelLoopTilingPass with an inbound_check mode. In default mode, the upper bound of the inner loop is from the min op; in inbound_check mode, the upper bound of the inner loop is the step of the outer loop and an additional inbound check will be emitted inside of the inner loop. This was a 'FIXME' in the original code; a typical usage is for GPU backends, where the outer loop and inner loop can be mapped to blocks/threads separately. Differential Revision: https://reviews.llvm.org/D105455	2021-08-16 14:02:53 +02:00
harsh-nod	e33f301ec2	[mlir] Add support for moving reductions to outer most dimensions in vector.multi_reduction The approach for handling reductions in the outer most dimension follows that for inner most dimensions, outlined below: first, transpose to move the reduction dims, if needed; second, convert the reduction from n-d to 2-d canonical form; then, for outer reductions, emit the appropriate op (add/mul/min/max/or/and/xor) and combine the results. Differential Revision: https://reviews.llvm.org/D107675	2021-08-13 12:59:50 -07:00
Tyler Augustine	3a2ff982d7	Support post-processing Ops in unrolled loop iterations This can be useful when one needs to know which unrolled iteration an Op belongs to, for example, conveying noalias information among memory-affecting ops in parallel-access loops. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D107789	2021-08-11 23:11:10 +00:00
Rob Suderman	7de439b2be	[mlir][tosa] Migrate tosa to more efficient linalg.conv Existing linalg.conv2d is not well optimized for performance. Changed to a version that is more aligned for optimization. Include the corresponding transposes to use this optimized version. This also splits the conv and depthwise conv into separate implementations to avoid overly complex lowerings. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D107504	2021-08-11 11:05:12 -07:00
Benjamin Kramer	c1ebefdf77	[mlir] Make polynomial approximation emit std instead of LLVM ops This is a bit cleaner and removes issues with 2d vectors. It also has a big impact on constant folding, hence the test changes. Differential Revision: https://reviews.llvm.org/D107896	2021-08-11 16:37:21 +02:00
Alex Zinenko	79b0576dd4	[mlir] Tighten LLVM_AnyNonAggregate ODS type constraint The constraint was checking that the type is not an LLVM structure or array type, but was not checking that it is an LLVM-compatible type, making it accept incorrect types. As a result, some LLVM dialect ops could process values that are not compatible with the LLVM dialect leading to further issues with conversions and translations that assume all values are LLVM-compatible. Make LLVM_AnyNonAggregate only accept LLVM-compatible types. Reviewed By: cota, akuegel Differential Revision: https://reviews.llvm.org/D107889	2021-08-11 16:30:19 +02:00
Alexander Belyaev	1e733a8c04	Revert "Bufferization for tiled loop." This reverts commit `edaffebcb2`.	2021-08-11 10:04:12 +02:00
Alexander Belyaev	967578f0b8	Revert "[mlir] Change the pattern for TiledLoopOp bufferization." This reverts commit `2f946eaa9d`.	2021-08-11 10:01:36 +02:00
Rob Suderman	2b2ebb6f98	[mlir][tosa] Add folders for trivial tosa operation cases Some folding cases are trivial to fold away, specifically no-op cases where an operation's input and output are the same. Canonicalizing these away removes unneeded operations. The current version includes tensor cast operations to resolve shape discrepancies that occur when an operation's result type differs from the input type. These are resolved during a tosa shape propagation pass. Reviewed By: NatashaKnk Differential Revision: https://reviews.llvm.org/D107321	2021-08-10 14:43:00 -07:00
Alexander Belyaev	2f946eaa9d	[mlir] Change the pattern for TiledLoopOp bufferization. This version does not affect the patterns for Extract/InsertSliceOp and LinalgOps. Differential Revision: https://reviews.llvm.org/D107858	2021-08-10 21:27:02 +02:00
bakhtiyar	391456f33c	Fix a bug in algebraic simplification, and enable the tests. Reviewed By: ezhulenev Differential Revision: https://reviews.llvm.org/D107788	2021-08-10 04:15:56 -07:00
Alexander Belyaev	edaffebcb2	Cloned from CL 389610703 by 'g4 patch'. Original change by pifon@pifon:tfrt_clean:6896:citc on 2021/08/09 05:30:17. Ad b Differential Revision: https://reviews.llvm.org/D107762	2021-08-09 21:57:06 +02:00
Aart Bik	05c7f450df	[mlir][sparse] add dense to sparse conversion implementation Implements lowering dense to sparse conversion, for static tensor types only. First step towards general sparse_tensor.convert support. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D107681	2021-08-09 12:12:39 -07:00
Max Kudryavtsev	0b8cb87e0d	[MLIR][STD] Add safe scalar constant propagation for FPTruncOp Perform scalar constant propagation for FPTruncOp only if the resulting value can be represented without precision loss or rounding. Example: %cst = constant 1.000000e+00 : f32 %0 = fptrunc %cst : f32 to bf16 --> %cst = constant 1.000000e+00 : bf16 Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D107518	2021-08-06 16:31:29 -07:00
Alexander Belyaev	a552debdcf	[mlir] Add patterns for vector.transfer_read/write to Linalg bufferization. Differential Revision: https://reviews.llvm.org/D107643	2021-08-06 20:24:44 +02:00
Geoffrey Martin-Noble	ca6baf1e1d	[MLIR][std] Introduce bitcast operation This patch introduces a bitcast operation to the standard dialect. RFC: https://llvm.discourse.group/t/rfc-introduce-a-bitcast-op/3774 Reviewed By: silvas Differential Revision: https://reviews.llvm.org/D105376	2021-08-06 08:47:51 -07:00
Adrian Kuegel	d6b4993736	[mlir][MemRef] Fix canonicalization of BufferCast(TensorLoad). CastOp::areCastCompatible does not check whether casts are definitely compatible. When going from dynamic to static offset or stride, the canonicalization cannot know whether it is really cast compatible. In that case, it can only canonicalize to an alloc plus copy. Differential Revision: https://reviews.llvm.org/D107545	2021-08-06 08:32:35 +02:00
Jacques Pienaar	9d10be70a8	[mlir] std.call reference function return types in failure Makes it easier to see type mismatch from failure locally. Differential Revision: https://reviews.llvm.org/D107288	2021-08-05 19:51:48 -07:00
Stephen Neuendorffer	432341d8a8	[mlir] Handle cases where transfer_read should turn into a scalar load The existing vector transforms reduce the dimension of transfer_read ops. However, beyond a certain point, the vector op actually has to be reduced to a scalar load, since we can't load a zero-dimension vector. This handles this case. Note that in the longer term, it may be preferable to support zero-dimension vectors. See https://llvm.discourse.group/t/should-we-have-0-d-vectors/3097. Differential Revision: https://reviews.llvm.org/D103432	2021-08-03 22:53:40 -07:00

2129 Commits