llvm-project

Commit Graph

Author	SHA1	Message	Date
Arjun P	4bf9cbc408	[MLIR][Presburger] subtract: improve redundant constraint detection When constraints in the two operands make each other redundant, prefer constraints of the second because this affects the number of sets in the output at each level; reducing these can help prevent exponential blowup. This is accomplished by adding extra overloads to Simplex::detectRedundant that only scan a subrange of the constraints for redundancy. Reviewed By: Groverkss Differential Revision: https://reviews.llvm.org/D127237	2022-06-08 14:44:31 -04:00
wren romano	0371ddf9ad	[mlir] Refactoring the tablegen Tensor types Reduces repetition in tablegen files for defining various tensor types. In particular the goal is to reduce the repetition when defining new tensor types (e.g., D126994). Reviewed By: aartbik, rriddle Differential Revision: https://reviews.llvm.org/D127039	2022-06-08 11:33:48 -07:00
bixia1	6c6eddb617	[mlir] Lower complex.power and complex.rsqrt to standard dialect. Add conversion tests and correctness tests. Reviewed By: pifon2a Differential Revision: https://reviews.llvm.org/D127255	2022-06-08 10:53:53 -07:00
dime10	4f55ed5a1e	Add Python bindings for the OpaqueType Implement the C-API and Python bindings for the builtin opaque type, which was previously missing. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D127303	2022-06-08 19:51:00 +02:00
Mogball	ee70039ae2	[mlir] Fix handling of some region branch terminator successors When `RegionBranchOpInterface::getSuccessorRegions` is called for anything other than the parent op, it expects the operands of the terminator of the source region to be passed, not the operands of the parent op. This was not always respected. This fixes a bug in integer range inference and ForwardDataFlowSolver and changes `scf.while` to allow narrowing of successors using constant inputs. Fixes #55873 Reviewed By: mehdi_amini, krzysz00 Differential Revision: https://reviews.llvm.org/D127261	2022-06-08 17:17:03 +00:00
bixia1	ea8ed5cbcf	[mlir][sparse] Add F16 and BF16. This is the first PR to add `F16` and `BF16` support to the sparse codegen. There are still problems in supporting these two data types, such as `BF16` is not quite working yet. Add tests cases. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D127010	2022-06-08 09:51:05 -07:00
Lei Zhang	2dfefe0283	[mlir][spirv] NFC: fix typo in UnifyAliasedResourcePass pass Reviewed By: ThomasRaoux, hanchung Differential Revision: https://reviews.llvm.org/D127265	2022-06-08 08:18:12 -07:00
lorenzo chelini	a0fc94ab61	[MLIR][Math] Add round operation Introduce RoundOp in the math dialect. The operation rounds the operand to the nearest integer value in floating-point format. RoundOp lowers to LLVM intrinsics 'llvm.intr.round' or as a function call to libm (round or roundf). Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D127286	2022-06-08 13:07:39 +02:00
Matthias Springer	032be23309	[mlir][bufferize] Improve buffer writability analysis Find writability conflicts (writes to buffers that are not allowed to be written to) by checking SSA use-def chains. This is better than the current writability analysis, which is too conservative and finds false positives. Differential Revision: https://reviews.llvm.org/D127256	2022-06-08 10:11:52 +02:00
Benjamin Kramer	6eb0f8e285	[mlir][MemRef] Fix a crash when expanding a scalar shape In this case the reassociation is empty, yielding no strides for the result type. Differential Revision: https://reviews.llvm.org/D127232	2022-06-08 09:37:40 +02:00
lorenzo chelini	d48479791f	[MLIR][SCF] Improve doc (NFC)	2022-06-08 08:46:36 +02:00
Nathan Lanza	f46ce03734	[MLIR] Add an install target for mlir-libraries This is required for the distribution system for installing the mlir-libraries component. This is copied from clang's equivalent feature. Differential Revision: https://reviews.llvm.org/D126837	2022-06-07 22:57:07 -04:00
Aart Bik	7482cd6869	[mlir][sparse] updated our sparse dialect doc with some recent changes The `init` and `tensor` ops are renamed (and one moved to another dialect). Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D127169	2022-06-07 14:27:57 -07:00
Christopher Bate	53fe155b3f	Revert "[mlir][vector] Allow unroll of contraction in arbitrary order" Reverts commit `1469ebf838` (original commit) Reverts commit `a392a39f75` (build fix for above commit) The commit broke tests in out-of-tree projects, indicating that some logical error was made in the previous change but not covered by current tests.	2022-06-07 14:54:01 -06:00
Groverkss	445e2b2aa0	[MLIR][Presburger] Fix subtract processing extra inequalities This patch fixes a bug in PresburgeRelation::subtract that made it process the inequality at index 0, multiple times. This was caused by allocating memory instead of reserving memory in llvm::SmallVector. Reviewed By: arjunp Differential Revision: https://reviews.llvm.org/D127228	2022-06-07 22:51:03 +05:30
Kiran Chandramohan	dd32bf9a77	[Flang,MLIR,OpenMP] Fix a few tests that were not converting to LLVM A few OpenMP tests were retaining the FIR operands even after running the LLVM conversion pass. To fix these tests the legality checkes for OpenMP conversion are made stricter to include operands and results. The Flush, Single and Sections operations are added to conversions or legality checks. The RegionLessOpConversion is appropriately renamed to clarify that it works only for operations with Variable operands. The operands of the flush operation are changed to match those of Variable Operands. Fix for an OpenMP issue mentioned in https://github.com/llvm/llvm-project/issues/55210. Reviewed By: shraiysh, peixin, awarzynski Differential Revision: https://reviews.llvm.org/D127092	2022-06-07 09:55:53 +00:00
Alex Zinenko	3326eddcd1	[mlir] fix documentation format in SCF Four leading spaces are interpreted as a code block in markdown. Unless used consistently in ODS op description, they cannot be stripped away by the tablegen backend, which results in malformed markdown being generated.	2022-06-07 11:51:24 +02:00
Alexander Batashev	8324561e33	[mlir][spirv] Correctly deduce PhysicalStorageBuffer64 addressing model According to the SPIR-V specification[1], PhysicalStorageBuffer storage class can only be used iff addressing model is PhysicalStorageBuffer64. [1]: https://www.khronos.org/registry/SPIR-V/specs/unified1/SPIRV.html#_addressing_model Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D127067	2022-06-07 12:14:38 +03:00
lorenzo chelini	9b3712e0bf	[MLIR][LLVMIR] Add round intrinsic Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D126879	2022-06-07 10:27:55 +02:00
lewuathe	62a34f6a6f	[mlir][complex] Add complex.conj op Add complex.conj op to calculate the complex conjugate which is widely used for the mathematical operation on the complex space. Reviewed By: pifon2a Differential Revision: https://reviews.llvm.org/D127181	2022-06-07 09:38:35 +02:00
lorenzo chelini	2cbf0b3dc6	[MLIR][SCF] Fix top-level comment (NFC)	2022-06-07 08:52:11 +02:00
River Riddle	a3a4f0335f	[vscode-mlir] Bump to version 0.9 Since version 0.8 we've added: * Switched PDLL and TableGen to use incremental doc updates * Added support to PDLL for inlay hints	2022-06-06 20:20:19 -07:00
River Riddle	5919eab55c	[mlir:PDLL] Add support for inlay hints These allow for displaying additional inline information, such as the types of variables, names operands/results, constraint/rewrite arguments, etc. This requires a bump in the vscode extension to a newer version, as inlay hints are a new LSP feature. Differential Revision: https://reviews.llvm.org/D126033	2022-06-06 20:20:19 -07:00
River Riddle	6187178e83	[mlir:LSP] Switch document sync mode to Incremental This is much more efficient over the full mode, as it only requires sending smalls chunks of files. It also works around a weird command ordering issue (full document updates are being sent after other commands like code completion) in newer versions of vscode. Differential Revision: https://reviews.llvm.org/D126032	2022-06-06 20:20:19 -07:00
River Riddle	1b501cbcbb	[mlir] Add documentation for TableGen LSP features and setup This commit beefs up the documentation for MLIR language servers by adding proper documentations/examples/etc for the provided TableGen language server capabilities. Given that this documentation is also used for the vscode extension, this commit also updates the user facing vscode extension documentation. Note that the images referenced in the new documentation are hosted on the website, and will be commited to mlir-www shortly after this commit lands.	2022-06-06 18:29:31 -07:00
Georgios Pinitas	3bcaf2eb93	[mlir][tosa] Moves constant folding operations out of the Canonicalizer Transpose operations on constant data were getting folded during the canonicalization process. This has compile time cost proportional to the constant size. Moving this to a separate pass to enable optionality and flexibility of how such scenarios can be handled. Reviewed By: rsuderman, jpienaar, stellaraccident Differential Revision: https://reviews.llvm.org/D124685	2022-06-06 22:10:22 +00:00
Christopher Bate	a392a39f75	[mlir][vector] fix typo in vector unroll transform	2022-06-06 16:09:13 -06:00
Christopher Bate	1469ebf838	[mlir][vector] Allow unroll of contraction in arbitrary order Adds supprot for vector unroll transformations to unroll in different orders. For example, the `vector.contract` can be unrolled into a smaller set of contractions. There is a choice of how to unroll the decomposition based on the traversal order of (dim0, dim1, dim2). The choice of traversal order can now be specified by a callback which given by the caller of the transform. For now, only the `vector.contract`, `vector.transfer_read/transfer_write` operations support the callback. Differential Revision: https://reviews.llvm.org/D127004	2022-06-06 14:31:04 -06:00
River Riddle	731dfca8a0	[mlir] Add documentation for PDLL LSP features and setup This commit beefs up the documentation for MLIR language servers by adding proper documentations/examples/etc for the provided PDLL language server capabilities. Given that this documentation is also used for the vscode extension, this commit also updates the user facing vscode extension documentation. Not that the images referenced in the new documentation are hosted on the website, and will be commited to mlir-www shortly after this commit lands. Differential Revision: https://reviews.llvm.org/D125650	2022-06-06 13:13:54 -07:00
Christopher Bate	cca662b849	[mlir][linalg] add conv_2d_nhwc_fhwc named op This operation should be supported as a named op because when the operands are viewed as having canonical layouts with decreasing strides, then the "reduction" dimensions of the filter (h, w, and c) are contiguous relative to each output channel. When lowered to a matrix multiplication, this layout is the simplest to deal with, and thus future transforms/vectorizations of `conv2d` may find using this named op convenient. Differential Revision: https://reviews.llvm.org/D126995	2022-06-06 13:18:08 -06:00
Christopher Bate	99069ab212	[mlir][linalg] fix crash when promoting rank-reducing memref.subviews This change adds support for promoting `linalg` operation operands that are produced by rank-reducing `memref.subview` ops. Differential Revision: https://reviews.llvm.org/D127086	2022-06-06 12:06:36 -06:00
jacquesguan	ad44495ad3	[mlir][NFC] Replace some llvm::find with llvm::is_contained. This patch replaces some llvm::find with llvm::is_contained, it should be more clear. Differential Revision: https://reviews.llvm.org/D127077	2022-06-06 03:01:14 +00:00
Stella Laurenzo	768a251587	[mlir] Tunnel LLVM_USE_LINKER through to the standalone example build. When building in debug mode, the link time of the standalone sample is excessive, taking upwards of a minute if using BFD. This at least allows lld to be used if the main invocation was configured that way. On my machine, this gets a standalone test that requires a relink to run in ~13s for Debug mode. This is still a lot, but better than it was. I think we may want to do something about this test: it adds a lot of latency to a normal compile/test cycle and requires a bunch of arg fiddling to exclude. I think we may end up wanting a `check-mlir-heavy` target that can be used just prior to submit, and then make `check-mlir` just run unit/lite tests. More just thoughts for the future (none of that is done here). Reviewed By: bondhugula, mehdi_amini Differential Revision: https://reviews.llvm.org/D126585	2022-06-05 12:31:41 -07:00
Fangrui Song	d86a206f06	Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options	2022-06-05 00:31:44 -07:00
Christian Sigg	400fef081a	Recommit: "[MLIR][NVVM] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration." This change rolls `bcfc0a9051` forward (i.e., reverting `369ce54bb3`) with fixed CMakeLists.txt.	2022-06-05 09:11:43 +02:00
Jacques Pienaar	29794ab0fa	[mlir] Use context provided rather than getContext Avoids "pass state was never initialized" assertion failure.	2022-06-04 12:18:51 -07:00
Mehdi Amini	369ce54bb3	Revert "[MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration." This reverts commit `bcfc0a9051`. The build is broken with shared library enabled.	2022-06-04 08:35:45 +00:00
Christian Sigg	bcfc0a9051	[MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration. This is correct for all values, i.e. the same as promoting the division to fp32 in the NVPTX backend. But it is faster (~10% in average, sometimes more) because: - it performs less Newton iterations - it avoids the slow path for e.g. denormals - it allows reuse of the reciprocal for multiple divisions by the same divisor Test program: ``` #include <stdio.h> #include "cuda_fp16.h" // This is a variant of CUDA's own __hdiv which is fast than hdiv_promote below // and doesn't suffer from the perf cliff of div.rn.fp32 with 'special' values. __device__ half hdiv_newton(half a, half b) { float fa = __half2float(a); float fb = __half2float(b); float rcp; asm("{rcp.approx.ftz.f32 %0, %1;\n}" : "=f"(rcp) : "f"(fb)); float result = fa * rcp; auto exponent = reinterpret_cast<const unsigned&>(result) & 0x7f800000; if (exponent != 0 && exponent != 0x7f800000) { float err = __fmaf_rn(-fb, result, fa); result = __fmaf_rn(rcp, err, result); } return __float2half(result); } // Surprisingly, this is faster than CUDA's own __hdiv. __device__ half hdiv_promote(half a, half b) { return __float2half(__half2float(a) / __half2float(b)); } // This is an approximation that is accurate up to 1 ulp. __device__ half hdiv_approx(half a, half b) { float fa = __half2float(a); float fb = __half2float(b); float result; asm("{div.approx.ftz.f32 %0, %1, %2;\n}" : "=f"(result) : "f"(fa), "f"(fb)); return __float2half(result); } __global__ void CheckCorrectness() { int i = threadIdx.x + blockIdx.x * blockDim.x; half x = reinterpret_cast<const half&>(i); for (int j = 0; j < 65536; ++j) { half y = reinterpret_cast<const half&>(j); half d1 = hdiv_newton(x, y); half d2 = hdiv_promote(x, y); auto s1 = reinterpret_cast<const short&>(d1); auto s2 = reinterpret_cast<const short&>(d2); if (s1 != s2) { printf("%f (%u) / %f (%u), got %f (%hu), expected: %f (%hu)\n", __half2float(x), i, __half2float(y), j, __half2float(d1), s1, __half2float(d2), s2); //__trap(); } } } __device__ half dst; __global__ void ProfileBuiltin(half x) { #pragma unroll 1 for (int i = 0; i < 10000000; ++i) { x = x / x; } dst = x; } __global__ void ProfilePromote(half x) { #pragma unroll 1 for (int i = 0; i < 10000000; ++i) { x = hdiv_promote(x, x); } dst = x; } __global__ void ProfileNewton(half x) { #pragma unroll 1 for (int i = 0; i < 10000000; ++i) { x = hdiv_newton(x, x); } dst = x; } __global__ void ProfileApprox(half x) { #pragma unroll 1 for (int i = 0; i < 10000000; ++i) { x = hdiv_approx(x, x); } dst = x; } int main() { CheckCorrectness<<<256, 256>>>(); half one = __float2half(1.0f); ProfileBuiltin<<<1, 1>>>(one); // 1.001s ProfilePromote<<<1, 1>>>(one); // 0.560s ProfileNewton<<<1, 1>>>(one); // 0.508s ProfileApprox<<<1, 1>>>(one); // 0.304s auto status = cudaDeviceSynchronize(); printf("%s\n", cudaGetErrorString(status)); } ``` Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D126158	2022-06-04 08:03:29 +02:00
wren romano	3cf03f1c56	[mlir][sparse] Adding IsSparseTensorPred and updating ops to use it Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D126994	2022-06-03 17:15:31 -07:00
Christopher Bate	9f819f4c62	[mlir][linalg] fix crash in vectorization of elementwise operations The current vectorization logic implicitly expects "elementwise" linalg ops to have projected permutations for indexing maps, but the precondition logic misses this check. This can result in a crash when executing the generic vectorization transform on an op with a non-projected permutation input indexing map. This change fixes the logic and adds a test (which crashes without this fix). Differential Revision: https://reviews.llvm.org/D127000	2022-06-03 16:38:13 -06:00
Diego Caballero	9a79b1b04c	[mlir] Add peeling xform to Codegen Strategy This patch adds the knobs to use peeling in the codegen strategy infrastructure. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D126842	2022-06-03 21:31:43 +00:00
Krzysztof Drewniak	95aff23e29	Re-land "[mlir] Add integer range inference analysis"" This reverts commit `4e5ce2056e`. This relands commit `1350c9887d`. Reinstates the range analysis with the build issue fixed. Differential Revision: https://reviews.llvm.org/D126926	2022-06-03 17:13:48 +00:00
lewuathe	d4141c93a8	[mlir][complex] Check the correctness of tanh in complex dialect Correctness check for tanh operation in complex dialect. Ref: https://reviews.llvm.org/D126858 Reviewed By: pifon2a Differential Revision: https://reviews.llvm.org/D126946	2022-06-03 14:04:48 +02:00
Adrian Kuegel	39f28397e2	[mlir] Fix ClangTidy warning (NFC). virtual is redundant since the function is already declared 'override'.	2022-06-03 12:46:14 +02:00
Shraiysh Vaishay	f5d29c15bf	[mlir][OpenMP] Add memory_order clause tests This patch adds tests for memory_order clause for atomic update and capture operations. This patch also adds a check for making sure that the operations inside and omp.atomic.capture region do not specify the memory_order clause. Reviewed By: kiranchandramohan, peixin Differential Revision: https://reviews.llvm.org/D126195	2022-06-03 13:41:22 +05:30
Nicolas Vasilache	72de7588cc	[mlir][SCF] Add bufferization hook for scf.foreach_thread and terminator. `scf.foreach_thread` results alias with the underlying `scf.foreach_thread.parallel_insert_slice` destination operands and they bufferize to equivalent buffers in the absence of other conflicts. `scf.foreach_thread.parallel_insert_slice` conflict detection is similar to `tensor.insert_slice` conflict detection. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D126769	2022-06-03 07:14:05 +00:00
Alexander Batashev	b34fb277df	[mlir][cf] Implement missing SwitchOp::build function Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D126594	2022-06-03 09:08:04 +03:00
Thomas Raoux	271a48e029	[mlir][VectorToGPU] Fix bug generating incorrect ldmatrix ops ldmatrix transpose can only be used with types that are 16bits wide. Differential Revision: https://reviews.llvm.org/D126846	2022-06-03 04:30:22 +00:00
Thomas Raoux	205c08b54d	[mlir][scf] Add option to loop pipelining to not peel the epilogue Add an option to predicate the epilogue within the kernel instead of peeling the epilogue. This is a useful option to prevent generating large amount of code for deep pipeline. This currently require a user lamdba to implement operation predication. Differential Revision: https://reviews.llvm.org/D126753	2022-06-03 04:20:20 +00:00
River Riddle	ee1cf1f645	[mlir][NFC] Simplify the various `parseSourceFile<T>` overloads These effectively all share the same implementation, i.e. forward to the non-templated overload and then construct the container op.	2022-06-02 19:18:55 -07:00

1 2 3 4 5 ...

11491 Commits