llvm-project

Commit Graph

Author	SHA1	Message	Date
Diego Caballero	96891f0418	Reland: [mlir][Vector][Affine] Improve affine vectorizer algorithm This patch replaces the root-terminal vectorization approach implemented in the Affine vectorizer with a topological order approach that vectorizes all the operations within the target loop nest. These are the most important changes introduced by the new algorithm: * Removed tracking of root and terminal ops. Existing vectorization functionality is preserved and extended so that loop nests without root-terminal chains can be vectorized. * Vectorizing a loop nest now only requires a single topological traversal. * A new vector loop nest is incrementally built along the vectorization process. The original scalar loop is kept intact. No cloning guard is needed to recover the scalar loop if vectorization fails. This approach also simplifies the challenging task of replacing a loop operation amid the vectorization process without invalidating the analysis information that depends on the original loop. * Vectorization of specific operations has been implemented as independent, preparing them to be moved to a potential vectorization interface. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D97442	2021-03-12 00:19:50 +02:00
River Riddle	31bb8efd69	[mlir][StorageUniquer] Properly call the destructor on non-trivially destructible storage instances This allows for storage instances to store data that isn't uniqued in the context, or contain otherwise non-trivial logic, in the rare situations that they occur. Storage instances with trivial destructors will still have their destructor skipped. A consequence of this is that the storage instance definition must be visible from the place that registers the type. Differential Revision: https://reviews.llvm.org/D98311	2021-03-11 11:35:32 -08:00
Diego Caballero	ed193bce9d	[mlir][Vector][Affine] Fix heap-use-after-free in vectorizer This patch fixes a heap-use-after-free introduced by the recent changes in the vectorizer: https://reviews.llvm.org/rG95db7b4aeaad590f37720898e339a6d54313422f The problem is due to the way candidate loops are visited. All candidate loops are pattern-matched beforehand using the 'NestedMatch' utility. These matches may intersect with each other so it may happen that we try to vectorize a loop that was previously vectorized. The new vectorization algorithm replaces the original loops that are vectorized with new loops and, therefore, any reference to the original loops in the pre-computed matches becomes invalid. This patch fixes the problem by classifying the candidate matches into buckets before vectorization. Each bucket contains all the matches that intersect. The vectorizer uses these buckets to make sure that we only vectorize one match from each bucket, at most. Differential Revision: https://reviews.llvm.org/D98382	2021-03-11 20:44:07 +02:00
Nikita Popov	f3f0c6cd47	[mlir] Remove uses of type-less CreateLoad() APIs (NFC) For the use in LLVMOps.td I used the getPointerElementType() escape hatch, as it's not obvious to me how the load type should be properly obtained here.	2021-03-11 18:39:20 +01:00
Alex Zinenko	27104390e8	[mlir] fix cmake build	2021-03-11 18:22:00 +01:00
Alex Zinenko	3ba14fa0ce	[mlir] Introduce data layout modeling subsystem Data layout information allows to answer questions about the size and alignment properties of a type. It enables, among others, the generation of various linear memory addressing schemes for containers of abstract types and deeper reasoning about vectors. This introduces the subsystem for modeling data layouts in MLIR. The data layout subsystem is designed to scale to MLIR's open type and operation system. At the top level, it consists of attribute interfaces that can be implemented by concrete data layout specifications; type interfaces that should be implemented by types subject to data layout; operation interfaces that must be implemented by operations that can serve as data layout scopes (e.g., modules); and dialect interfaces for data layout properties unrelated to specific types. Built-in types are handled specially to decrease the overall query cost. A concrete default implementation of these interfaces is provided in the new Target dialect. Defaults for built-in types that match the current behavior are also provided. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D97067	2021-03-11 16:54:47 +01:00
Arpith C. Jacob	b4a516cc43	[mlir] Add LLVM loop codegen options to control software pipelining Support specifying the II and disabling pipelining. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D98420	2021-03-11 16:46:44 +01:00
Tres Popp	25a20b8aa6	[mlir] Correct verifyCompatibleShapes verifyCompatibleShapes is not transitive. Create an n-ary version and update SameOperandShapes and SameOperandAndResultShapes traits to use it. Differential Revision: https://reviews.llvm.org/D98331	2021-03-11 13:04:10 +01:00
Julian Gross	2aef202981	[mlir] Fix invalid hoisting of dependent allocs in buffer hoisting pass. Buffer hoisting moves allocs upwards although it has dependency within its nested region. This patch fixes this issue. https://bugs.llvm.org/show_bug.cgi?id=49142 Differential Revision: https://reviews.llvm.org/D98248	2021-03-11 11:46:16 +01:00
Christian Sigg	bafe418d12	[mlir] Change test-gpu-to-cubin to derive from SerializeToBlobPass Clean-up after D98279, remove one call to createConvertGPUKernelToBlobPass(). Depends On D98203 Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D98360	2021-03-11 10:42:20 +01:00
Frederik Gossen	b975e3b5aa	[MLIR] Add canoncalization for `shape.is_broadcastable` Canonicalize `is_broadcastable` to constant true if fewer than 2 unique shape operands. Eliminate redundant operands, otherwise. Differential Revision: https://reviews.llvm.org/D98361	2021-03-11 10:10:34 +01:00
Christian Sigg	2224221fb3	[mlir] Add NVVM to CUBIN conversion to mlir-opt If MLIR_CUDA_RUNNER_ENABLED, register a 'gpu-to-cubin' conversion pass to mlir-opt. The next step is to switch CUDA integration tests from mlir-cuda-runner to mlir-opt + mlir-cpu-runner and remove mlir-cuda-runner. Depends On D98279 Reviewed By: herhut, rriddle, mehdi_amini Differential Revision: https://reviews.llvm.org/D98203	2021-03-11 10:07:11 +01:00
Matthias Springer	c40e0d7609	[mlir][AVX512] Implement sparse vector dot product integration test. This test operates on two hardware-vector-sized vectors and utilizes vp2intersect and mask.compress. PHAB_REVIEW=D98099	2021-03-11 13:00:17 +09:00
River Riddle	4e02eb8014	[mlir] Optimize the implementation of RegionDCE The current implementation has some inefficiencies that become noticeable when running on large modules. This revision optimizes the code, and updates some out-dated idioms with newer utilities. The main components of this optimization include: * Add an overload of Block::eraseArguments that allows for O(N) erasure of disjoint arguments. * Don't process entry block arguments given that we don't erase them at this point. * Don't track individual operation results, given that we don't erase them. We can just track the parent operation. Differential Revision: https://reviews.llvm.org/D98309	2021-03-10 16:39:50 -08:00
Emilio Cota	c0891706bc	[mlir] Add polynomial approximation for math::Log2 ``` name old cpu/op new cpu/op delta BM_mlir_Log2_f32/10 134ns ±15% 45ns ± 4% -66.39% (p=0.000 n=20+17) BM_mlir_Log2_f32/100 1.03µs ±16% 0.12µs ±10% -88.78% (p=0.000 n=20+18) BM_mlir_Log2_f32/1k 10.3µs ±16% 0.7µs ± 5% -93.24% (p=0.000 n=20+17) BM_mlir_Log2_f32/10k 104µs ±15% 7µs ±14% -93.25% (p=0.000 n=20+20) BM_eigen_s_Log2_f32/10 95.3ns ±17% 90.9ns ± 6% ~ (p=0.228 n=20+18) BM_eigen_s_Log2_f32/100 907ns ± 3% 911ns ± 6% ~ (p=0.539 n=16+20) BM_eigen_s_Log2_f32/1k 9.88µs ± 4% 9.85µs ± 3% ~ (p=0.790 n=16+17) BM_eigen_s_Log2_f32/10k 105µs ±10% 110µs ±16% ~ (p=0.459 n=16+20) BM_eigen_v_Log2_f32/10 32.5ns ±31% 33.9ns ±14% +4.31% (p=0.028 n=17+20) BM_eigen_v_Log2_f32/100 176ns ± 8% 180ns ± 7% +2.19% (p=0.045 n=16+17) BM_eigen_v_Log2_f32/1k 1.44µs ± 4% 1.50µs ± 9% +3.91% (p=0.001 n=16+17) BM_eigen_v_Log2_f32/10k 14.5µs ±10% 15.0µs ± 8% +3.92% (p=0.002 n=16+19) ``` Reviewed By: ezhulenev Differential Revision: https://reviews.llvm.org/D98282	2021-03-10 14:49:22 -08:00
Christian Sigg	6a291ed0f0	[mlir] Remove unnecessary copying of pass options I missed a comment in D98279 that you don't need to copy pass options. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D98366	2021-03-10 21:55:28 +01:00
Weiwei Li	619c1505f9	[mlir][spirv] Define spv.Image Operation co-authered-by: Alan Liu <alanliu.yf@gmail.com> Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D98270	2021-03-10 15:48:04 -05:00
Alex Zinenko	79da91c59a	Revert "[mlir][Vector][Affine] Improve affine vectorizer algorithm" This reverts commit `95db7b4aea`. This breaks vectorize_2d.mlir and vectorize_3d.mlir test under ASAN (use after free).	2021-03-10 20:25:49 +01:00
Alex Zinenko	ed715536f1	Revert "[mlir][Affine][Vector] Add initial support for 'iter_args' to Affine vectorizer." This reverts commit `77a9d1549f`. Parent commit is broken.	2021-03-10 20:25:32 +01:00
Diego Caballero	77a9d1549f	[mlir][Affine][Vector] Add initial support for 'iter_args' to Affine vectorizer. This patch adds support for vectorizing loops with 'iter_args' when those loops are not a vector dimension. This allows vectorizing outer loops with an inner 'iter_args' loop (e.g., reductions). Vectorizing scenarios where 'iter_args' loops are vector dimensions would require more work (e.g., analysis, generating horizontal reduction, etc.) not included in this patch. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D97892	2021-03-10 20:40:21 +02:00
Diego Caballero	95db7b4aea	[mlir][Vector][Affine] Improve affine vectorizer algorithm This patch replaces the root-terminal vectorization approach implemented in the Affine vectorizer with a topological order approach that vectorizes all the operations within the target loop nest. These are the most important changes introduced by the new algorithm: * Removed tracking of root and terminal ops. Existing vectorization functionality is preserved and extended so that loop nests without root-terminal chains can be vectorized. * Vectorizing a loop nest now only requires a single topological traversal. * A new vector loop nest is incrementally built along the vectorization process. The original scalar loop is kept intact. No cloning guard is needed to recover the scalar loop if vectorization fails. This approach also simplifies the challenging task of replacing a loop operation amid the vectorization process without invalidating the analysis information that depends on the original loop. * Vectorization of specific operations has been implemented as independent, preparing them to be moved to a potential vectorization interface. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D97442	2021-03-10 20:29:58 +02:00
Vladislav Vinogradov	b599f464d4	[mlir][CMAKE] Fix build with BUILD_SHARED_LIBS=ON Link `MLIRStandardToLLVM` to `MLIRAVX512Transforms`, since the latter uses `LLVMTypeConverter` defined in the first one. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D98336	2021-03-10 14:52:36 +01:00
Alex Zinenko	e02dd790b1	[mlir] fix typo in OpDefinitions.md	2021-03-10 14:44:08 +01:00
Alex Zinenko	78f3fb4f46	[mlir] Update comments in ArmNeon dialect. NFC These were not updated when squashing LLVMArmNeon and ArmNeon dialects.	2021-03-10 13:35:57 +01:00
Alex Zinenko	a776942ba1	[mlir] squash LLVM_AVX512 dialect into AVX512 The dialect separation was introduced to demarkate ops operating in different type systems. This is no longer the case after the LLVM dialect has migrated to using built-in vector types, so the original reason for separation is no longer valid. Squash the two dialects into one. The code size decrease isn't quite large: the ops originally in LLVM_AVX512 are preserved because they match LLVM IR intrinsics specialized for vector element bitwidth. However, it is still conceptually beneficial to have only one dialect. I originally considered to use Tablegen multiclasses to define both the type-polymorphic op and its two intrinsic-related instantiations, but decided against it given both the complexity of the required Tablegen input and its dissimilarity with the rest of ODS-defined ops, both potentially resulting in very poor maintainability. Depends On D98327 Reviewed By: nicolasvasilache, springerm Differential Revision: https://reviews.llvm.org/D98328	2021-03-10 13:07:26 +01:00
Alex Zinenko	0af53de369	[mlir] simplify type constraints in AVX512 dialect VectorOfLengthAndType accepts a cartesian product of given lengths and types rather than types produced by co-indexed values in the corresponding lists. Update the definitions accordingly. The type validity is already enforced by op traits. Reviewed By: nicolasvasilache, springerm Differential Revision: https://reviews.llvm.org/D98327	2021-03-10 13:07:25 +01:00
Inho Seo	2ce4caf414	Moved getStaticLoopRanges and getStaticShape methods to LinalgInterfaces.td to add static shape verification It is to use the methods in LinalgInterfaces.cpp for additional static shape verification to match the shaped operands and loop on linalgOps. If I used the existing methods, I would face circular dependency linking issue. Now we can use them as methods of LinalgOp. Reviewed By: hanchung Differential Revision: https://reviews.llvm.org/D98163	2021-03-10 04:06:22 -08:00
Christian Sigg	4d295cf5b5	[mlir] Add base class for GpuKernelToBlobPass Instead of configuring kernel-to-cubin/rocdl lowering through callbacks, introduce a base class that target-specific passes can derive from. Put the base class in GPU/Transforms, according to the discussion in D98203. The mlir-cuda-runner will go away shortly, and the mlir-rocdl-runner as well at some point. I therefore kept the existing code path working and will remove it in a separate step. Depends On D98168 Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D98279	2021-03-10 12:14:43 +01:00
Vladislav Vinogradov	f3bf5c053b	[mlir] Model MemRef memory space as Attribute Based on the following discussion: https://llvm.discourse.group/t/rfc-memref-memory-shape-as-attribute/2229 The goal of the change is to make memory space property to have more expressive representation, rather then "magic" integer values. It will allow to have more clean ASM form: ``` gpu.func @test(%arg0: memref<100xf32, "workgroup">) // instead of gpu.func @test(%arg0: memref<100xf32, 3>) ``` Explanation for `Attribute` choice instead of plain `string`: * `Attribute` classes allow to use more type safe API based on RTTI. * `Attribute` classes provides faster comparison operator based on pointer comparison in contrast to generic string comparison. * `Attribute` allows to store more complex things, like structs or dictionaries. It will allows to have more complex memory space hierarchy. This commit preserve old integer-based API and implements it on top of the new one. Depends on D97476 Reviewed By: rriddle, mehdi_amini Differential Revision: https://reviews.llvm.org/D96145	2021-03-10 12:57:27 +03:00
Hanhan Wang	d5d4fb635e	[mlir][linalg] Add support for using scalar attributes in TC ops. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D97876	2021-03-10 01:51:12 -08:00
Mehdi Amini	75f3f77805	Fix MLIR test post `890afad954`	2021-03-09 23:30:51 +00:00
Mehdi Amini	890afad954	Fix Flang build after MLIR API changes around `generatedTypeParser`	2021-03-09 23:19:30 +00:00
River Riddle	a776ecb6c2	[mlir][IR] Add an Operation::eraseOperands that supports batch erasure This method allows for removing multiple disjoint operands at once, reducing the need to erase operands individually (which results in shifting the operand list). Differential Revision: https://reviews.llvm.org/D98290	2021-03-09 15:07:53 -08:00
River Riddle	4a7aed4ee7	[mlir][IR] Add a new SymbolUserMap class This class provides efficient implementations of symbol queries related to uses, such as collecting the users of a symbol, replacing all uses, etc. This provides similar benefits to use related queries, as SymbolTableCollection did for lookup queries. Differential Revision: https://reviews.llvm.org/D98071	2021-03-09 15:07:52 -08:00
Mehdi Amini	cd9a69289c	Fix LLVM Dialect LoopOptionsAttr round-tripping: the keywords were missing in the output This indicated some missing test coverage, which are now added to the roundtrip test.	2021-03-09 22:00:22 +00:00
Mehdi Amini	fe81e8f3b5	Add default LoopOptionsAttrBuilder constructor and method to check if empty() (NFC) Also move setters out-of-line to make sure the templated helper is actually instantiated.	2021-03-09 21:12:15 +00:00
Christian Sigg	840ff84d33	[mlir] Default for gpu-binary-annotation option. Provide default for gpuBinaryAnnotation so that we don't need to specify it in tests. The annotation likely only needs to be target specific if we want to lower to e.g. both CUDA and ROCDL. Reviewed By: herhut, bondhugula Differential Revision: https://reviews.llvm.org/D98168	2021-03-09 21:01:50 +01:00
Mehdi Amini	79f736c150	Switch generatedTypeParser/generatedAttributeParser to return an OptionalParseResult This allows the caller to distinguish between a parse error or an unmatched keyword. It fixes the redundant error that was emitted by the caller when the generated parser would fail. Differential Revision: https://reviews.llvm.org/D98162	2021-03-09 19:43:45 +00:00
Mehdi Amini	8205c1a90a	Rework LLVM Dialect LoopOptions attribute Instead of storing an array of LoopOpt attributes, which were just wrapping std::pair<enum, int> anyway, we can have an attribute storing a sorted ArrayRef<std::pair<enum, int>> as a single unit. This improves here the textual format and the general API. Note that we're limiting the options to fit into an int64_t by design, but this isn't a new constraint. Building the LoopOptions attribute is likely worth a specific builder for efficient reason, that'll be the subject of a future patch. Differential Revision: https://reviews.llvm.org/D98105	2021-03-09 19:43:45 +00:00
Lei Zhang	50000abe3c	[mlir] Use affine.apply when distributing to processors This makes it easy to compose the distribution computation with other affine computations. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D98171	2021-03-09 08:37:20 -05:00
Alex Zinenko	8184247f0b	[mlir] move LLVM target import header and tests Move Target/LLVMIR.h to target/LLVMIR/Import.h to better reflect the purpose of this file. Also move all LLVM IR target tests under the LLVMIR directory. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D98178	2021-03-09 09:22:14 +01:00
Alex Zinenko	90fec5ed65	[mlir] make MLIRPresburger depend on MLIRIR The analysis library uses Location, which is defined in the MLIRIR library.	2021-03-09 09:19:53 +01:00
Vladislav Vinogradov	2241b3986c	[mlir][CMAKE] Fix cross-compilation build Use `MLIR_LINALG_ODS_GEN` and `MLIR_LINALG_ODS_YAML_GEN` variables instead of `MLIR_LINALG_ODS_GEN_EXE` and `MLIR_LINALG_ODS_YAML_GEN_EXE`. The former are defined in PARENT SCOPE only, so the `if` condition is never evaluates to `TRUE`. The logic should be the following (taken from tblgen part): 1. `TOOL_NAME` - CACHE variable (default equal to target name). User can override it to actual executable path. 2. `TOOL_NAME_EXE` - internal variable, initialized to `${TOOL_NAME}` first. In case of cross-compilation (`LLVM_USE_HOST_TOOLS == TRUE`) if user didn't set own path to native executable via `TOOL_NAME` variable, CMake will create separate targets to build native tool and will override `TOOL_NAME_EXE` to the executable produced by this target. 3. `TOOL_NAME_TARGET` - internal variable, which points to tool target name. If the native tool is built as described above, it will point to the target correspondant to that native tool. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D98025	2021-03-09 10:51:56 +03:00
Tobias Gysi	c1a4cd551f	[mlir][linalg] refactor the result handling during vectorization. Return the vectorization results using a vector passed by reference instead of returning them embedded in a structure. Differential Revision: https://reviews.llvm.org/D98182	2021-03-09 07:11:57 +00:00
Stella Laurenzo	e31c77b182	[mlir][python] Reorganize MLIR python into namespace packages. * Only leaf packages are non-namespace packages. This allows most of the top levels to be split into different directories or deployment packages. In the previous state, the presence of __init__.py files at each level meant that the entire tree could only ever exist in one physical directory on the path. * This changes the API usage slightly: `import mlir` will no longer do a deep import of `mlir.ir`, etc. This may necessitate some client code changes. * Dialect gen code was restructured so that the user is responsible for providing the `my_dialect.py` file, which then must import its peer `_my_dialect_ops_gen`. This gives complete control of the dialect namespace to the user instead of to tablegen code, allowing further dialect-specific python APIs. * Correspondingly, the previous extension modules `_my_dialect.py` are now `_my_dialect_ops_ext.py`. * Now that the `linalg` namespace is open, moved the `linalg_opdsl` tool into it. * This may require some corresponding downstream adjustments to npcomp, circt, et al: * Probably some shallow imports need to be converted to deep imports (i.e. not `import mlir` brings in the world). * Each tablegen generated dialect now needs an explicit `foo.py` which does a `from ._foo_ops_gen import `. This is similar to the way that generated code operates in the C++ world. If providing dialect op extensions, those need to be moved from `_foo.py` -> `_foo_ops_ext.py`. Differential Revision: https://reviews.llvm.org/D98096	2021-03-08 23:01:34 -08:00
Mehdi Amini	038f2a337d	Move LLVM::FMFAttr definition to TableGen (NFC) This is using the new Attribute storage generation support in TableGen to define the LLVM FastMathFlags. Differential Revision: https://reviews.llvm.org/D98007	2021-03-09 05:29:54 +00:00
River Riddle	0d01dfbc37	[mlir][IR][NFC] Move the remaining builtin types to ODS This will allow for removing the duplicated type documentation from LangRef and instead link to the builtin dialect documentation. Differential Revision: https://reviews.llvm.org/D98093	2021-03-08 14:32:40 -08:00
River Riddle	a4bb667d83	[mlir][IR][NFC] Define the Location classes in ODS instead of C++ This also removes the need for LocationDetail.h. Differential Revision: https://reviews.llvm.org/D98092	2021-03-08 14:32:40 -08:00
Rob Suderman	cb3542e1ca	[MLIR][TOSA] Added lowerings for Reduce operations to Linalg Lowerings for min, max, prod, and sum reduction operations on int and float values. This includes reduction tests for both cases. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D97893	2021-03-08 10:57:19 -08:00
Christian Sigg	7cdcb4a3b9	[mlir] NFC: Add #endif comment.	2021-03-08 19:25:24 +01:00

1 2 3 4 5 ...

6992 Commits