llvm-project

Commit Graph

Author	SHA1	Message	Date
Benjamin Kramer	bf759e3b10	[bazel] Port `7a5cb15ea6`	2022-07-26 12:53:38 +02:00
Weverything	de43f93a82	[bazel] Add new rule for `c60b897d22`	2022-07-25 20:29:01 -07:00
Alex Zinenko	333ee218ce	[mlir] Transform dialect: separate dependent and generated dialects In the Transform dialect extensions, provide the separate mechanism to declare dependent dialects (the dialects the transform IR depends on) and the generated dialects (the dialects the payload IR may be transformed into). This allows the Transform dialect clients that are only constructing the transform IR to avoid loading the dialects relevant for the payload IR along with the Transform dialect itself, thus decreasing the build/link time. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D130289	2022-07-25 09:59:53 +00:00
Benjamin Kramer	66e66117ba	[bazel] Add missing dependencies after `535b507ba5`	2022-07-23 13:25:23 +02:00
Tue Ly	d883a4ad02	[libc] Implement sinf function that is correctly rounded to all rounding modes. Implement sinf function that is correctly rounded to all rounding modes. - We use a simple range reduction for `pi/16 < \|x\|` : Let `k = round(x / pi)` and `y = (x/pi) - k`. So `k` is an integer and `-0.5 <= y <= 0.5`. Then ``` sin(x) = sin(ypi + kpi) = (-1)^(k & 1) * sin(ypi) ~ (-1)^(k & 1) y * P(y^2) ``` where `yP(y^2)` is a degree-15 minimax polynomial generated by Sollya with: ``` > P = fpminimax(sin(xpi)/x, [\|0, 2, 4, 6, 8, 10, 12, 14\|], [\|D...\|], [0, 0.5]); ``` - Performance benchmark using perf tool from CORE-MATH project (https://gitlab.inria.fr/core-math/core-math/-/tree/master) on Ryzen 1700: Before this patch (not correctly rounded): ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh sinf CORE-MATH reciprocal throughput : 17.892 System LIBC reciprocal throughput : 25.559 LIBC reciprocal throughput : 29.381 ``` After this patch (correctly rounded): ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh sinf CORE-MATH reciprocal throughput : 17.896 System LIBC reciprocal throughput : 25.740 LIBC reciprocal throughput : 27.872 LIBC reciprocal throughput : 20.012 (with `-msse4.2` flag) LIBC reciprocal throughput : 14.244 (with `-mfma` flag) ``` Reviewed By: zimmermann6 Differential Revision: https://reviews.llvm.org/D123154	2022-07-22 10:07:31 -04:00
Augie Fackler	a4ee8a31ce	[bazel] add headers now required after `17e4c217b6`	2022-07-21 15:39:29 -04:00
Nicolas Vasilache	1f77f01c65	[mlir][Linalg] Add a Transform dialect NavigationOp op to match a list of ops or an interface. This operation is a NavigationOp that simplifies the writing of transform IR. Since there is no way of refering to an interface by name, the current implementation uses an EnumAttr and depends on the interfaces it supports. In the future, it would be worthwhile to remove this dependence and generalize. Differential Revision: https://reviews.llvm.org/D130267	2022-07-21 07:11:42 -07:00
Benjamin Kramer	439668871a	[bazel] Also add -lrt to OrcTargetProcess for `1b1f1c7786`	2022-07-20 11:28:47 +02:00
Benjamin Kramer	24c88c90a8	[bazel] Add -lrt on non-darwin/non-windows for `1b1f1c7786` For shm_open in orc jit.	2022-07-20 11:24:13 +02:00
Sriraman Tallam	16cccc66b8	Bazel BUILD file for BOLT. Differential Revision: https://reviews.llvm.org/D129899	2022-07-19 16:03:52 -07:00
Cole Kissane	e939bf67e3	[llvm] add zstd to `llvm::compression` namespace - add zstd to `llvm::compression` namespace - add a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB` - add tests for zstd to `llvm/unittests/Support/CompressionTest.cpp` - debian users should install libzstd when using `LLVM_ENABLE_ZSTD=FORCE_ON` from source due to this bug https://bugs.launchpad.net/ubuntu/+source/libzstd/+bug/1941956 Reviewed By: leonardchan, MaskRay Differential Revision: https://reviews.llvm.org/D128465	2022-07-19 10:54:36 -07:00
Benjamin Kramer	b9ad55c6d4	[bazel] Fix the build after `18b92c66fe`	2022-07-19 17:34:39 +02:00
Benjamin Kramer	9235fafd6e	[bazel] Remove libraries that don't build anymore after `5e83a5b475` I don't know who uses these python extensions, probably nobody.	2022-07-19 17:13:23 +02:00
Aart Bik	28ebb0b61d	[mlir][sparse] migrate sparse rewriting to sparse transformations pass The rules in the linalg file were very specific to sparse tensors so will find a better home under sparse tensor dialect than linalg dialect. Also moved some rewriting from sparsification into this new "pre-rewriting" file. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D129910	2022-07-18 09:29:22 -07:00
Alex Zinenko	e0fc33eba5	[mlir] Fix Bazel for `5e83a5b475` Export the __init__.py from _mlir_libs.	2022-07-18 15:35:23 +02:00
Stella Laurenzo	5e83a5b475	[mlir] Overhaul C/Python registration APIs to properly scope registration/loading activities. Since the very first commits, the Python and C MLIR APIs have had mis-placed registration/load functionality for dialects, extensions, etc. This was done pragmatically in order to get bootstrapped and then just grew in. Downstreams largely bypass and do their own thing by providing various APIs to register things they need. Meanwhile, the C++ APIs have stabilized around this and it would make sense to follow suit. The thing we have observed in canonical usage by downstreams is that each downstream tends to have native entry points that configure its installation to its preferences with one-stop APIs. This patch leans in to this approach with `RegisterEverything.h` and `mlir._mlir_libs._mlirRegisterEverything` being the one-stop entry points for the "upstream packages". The `_mlir_libs.__init__.py` now allows customization of the environment and Context by adding "initialization modules" to the `_mlir_libs` package. If present, `_mlirRegisterEverything` is treated as such a module. Others can be added by downstreams by adding a `_site_initialize_{i}.py` module, where '{i}' is a number starting with zero. The number will be incremented and corresponding module loaded until one is not found. Initialization modules can: * Perform load time customization to the global environment (i.e. registering passes, hooks, etc). * Define a `register_dialects(registry: DialectRegistry)` function that can extend the `DialectRegistry` that will be used to bootstrap the `Context`. * Define a `context_init_hook(context: Context)` function that will be added to a list of callbacks which will be invoked after dialect registration during `Context` initialization. Note that the `MLIRPythonExtension.RegisterEverything` is not included by default when building a downstream (its corresponding behavior was prior). For downstreams which need the default MLIR initialization to take place, they must add this back in to their Python CMake build just like they add their own components (i.e. to `add_mlir_python_common_capi_library` and `add_mlir_python_modules`). It is perfectly valid to not do this, in which case, only the things explicitly depended on and initialized by downstreams will be built/packaged. If the downstream has not been set up for this, it is recommended to simply add this back for the time being and pay the build time/package size cost. CMake changes: * `MLIRCAPIRegistration` -> `MLIRCAPIRegisterEverything` (renamed to signify what it does and force an evaluation: a number of places were incidentally linking this very expensive target) * `MLIRPythonSoure.Passes` removed (without replacement: just drop) * `MLIRPythonExtension.AllPassesRegistration` removed (without replacement: just drop) * `MLIRPythonExtension.Conversions` removed (without replacement: just drop) * `MLIRPythonExtension.Transforms` removed (without replacement: just drop) Header changes: * `mlir-c/Registration.h` is deleted. Dialect registration functionality is now in `IR.h`. Registration of upstream features are in `mlir-c/RegisterEverything.h`. When updating MLIR and a couple of downstreams, I found that proper usage was commingled so required making a choice vs just blind S&R. Python APIs removed: * mlir.transforms and mlir.conversions (previously only had an __init__.py which indirectly triggered `mlirRegisterTransformsPasses()` and `mlirRegisterConversionPasses()` respectively). Downstream impact: Remove these imports if present (they now happen as part of default initialization). * mlir._mlir_libs._all_passes_registration, mlir._mlir_libs._mlirTransforms, mlir._mlir_libs._mlirConversions. Downstream impact: None expected (these were internally used). C-APIs changed: * mlirRegisterAllDialects(MlirContext) now takes an MlirDialectRegistry instead. It also used to trigger loading of all dialects, which was already marked with a TODO to remove -- it no longer does, and for direct use, dialects must be explicitly loaded. Downstream impact: Direct C-API users must ensure that needed dialects are loaded or call `mlirContextLoadAllAvailableDialects(MlirContext)` to emulate the prior behavior. Also see the `ir.c` test case (e.g. ` mlirContextGetOrLoadDialect(ctx, mlirStringRefCreateFromCString("func"));`). * mlirDialectHandle* APIs were moved from Registration.h (which now is restricted to just global/upstream registration) to IR.h, arguably where it should have been. Downstream impact: include correct header (likely already doing so). C-APIs added: * mlirContextLoadAllAvailableDialects(MlirContext): Corresponds to C++ API with the same purpose. Python APIs added: * mlir.ir.DialectRegistry: Mapping for an MlirDialectRegistry. * mlir.ir.Context.append_dialect_registry(MlirDialectRegistry) * mlir.ir.Context.load_all_available_dialects() * mlir._mlir_libs._mlirAllRegistration: New native extension that exposes a `register_dialects(MlirDialectRegistry)` entry point and performs all upstream pass/conversion/transforms registration on init. In this first step, we eagerly load this as part of the __init__.py and use it to monkey patch the Context to emulate prior behavior. * Type caster and capsule support for MlirDialectRegistry This should make it possible to build downstream Python dialects that only depend on a subset of MLIR. See: https://github.com/llvm/llvm-project/issues/56037 Here is an example PR, minimally adapting IREE to these changes: https://github.com/iree-org/iree/pull/9638/files In this situation, IREE is opting to not link everything, since it is already configuring the Context to its liking. For projects that would just like to not think about it and pull in everything, add `MLIRPythonExtension.RegisterEverything` to the list of Python sources getting built, and the old behavior will continue. Reviewed By: mehdi_amini, ftynse Differential Revision: https://reviews.llvm.org/D128593	2022-07-16 17:27:50 -07:00
Tue Ly	0f782b84cb	[libc] Add nearest integer instructions to fputil. Add round to nearest integer instructions to fputil. This will be used in sinf implementation https://reviews.llvm.org/D123154 Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D129776	2022-07-14 13:20:35 -04:00
Amara Emerson	6e6be5f950	Revert "[llvm] add zstd to llvm::compression namespace" This reverts commit `d449c60076`. Breaks macOS builds with this: llvm/lib/Support/Compression.cpp:24:10: fatal error: 'zstd.h' file not found	2022-07-14 01:23:20 -07:00
Cole Kissane	d449c60076	[llvm] add zstd to llvm::compression namespace - add `FindZSTD.cmake` - add zstd to `llvm::compression` namespace - add a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB` - add tests for zstd to `llvm/unittests/Support/CompressionTest.cpp` Reviewed By: leonardchan, MaskRay Differential Revision: https://reviews.llvm.org/D128465	2022-07-13 19:58:42 -07:00
Cole Kissane	5ecb161c64	Revert "[llvm] add zstd to `llvm::compression` namespace" This reverts commit `cef07169ec`.	2022-07-13 19:48:29 -07:00
Cole Kissane	cef07169ec	[llvm] add zstd to `llvm::compression` namespace - add `FindZSTD.cmake` - add zstd to `llvm::compression` namespace - add a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB` - add tests for zstd to `llvm/unittests/Support/CompressionTest.cpp` Reviewed By: leonardchan, MaskRay Differential Revision: https://reviews.llvm.org/D128465	2022-07-13 19:06:27 -07:00
Jorge Gorbe Moya	d6071fa52d	[bazel] add missing gmock dependency to //clang/unittests:format_tests	2022-07-12 18:13:42 -07:00
Krzysztof Drewniak	d6ef3d20b4	[mlir] Remove VectorToROCDL Between issues such as https://github.com/llvm/llvm-project/issues/56323, the fact that this lowering (unlike the code in amdgpu-to-rocdl) does not correctly set up bounds checks (and thus will cause page faults on reads that might need to be padded instead), and that fixing these problems would, essentially, involve replicating amdgpu-to-rocdl, remove --vector-to-rocdl for being broken. In addition, the lowering does not support many aspects of transfer_{read,write}, like supervectors, and may not work correctly in their presence. We (the MLIR-based convolution generator at AMD) do not use this conversion pass, nor are we aware of any other clients. Migration strategies: - Use VectorToLLVM - If buffer ops are particularly needed in your application, use amdgpu.raw_buffer_{load,store} A VectorToAMDGPU pass may be introduced in the future. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D129308	2022-07-12 15:21:22 +00:00
Alex Zinenko	3963b4d0dc	[mlir] Transform op for multitile size generation Introduce a structured transform op that emits IR computing the multi-tile sizes with requested parameters (target size and divisor) for the given structured op. The sizes may fold to arithmetic constant operations when the shape is constant. These operations may then be used to call the existing tiling transformation with a single non-zero dynamic size (i.e. perform strip-mining) for each of the dimensions separately, thus achieving multi-size tiling with optional loop interchange. A separate test exercises the entire script. Depends On D129217 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D129287	2022-07-12 12:36:28 +00:00
Alex Zinenko	4e4a4c0576	[mlir] Allow Tile transform op to take dynamic sizes Extend the definition of the Tile structured transform op to enable it accepting handles to operations that produce tile sizes at runtime. This is useful by itself and prepares for more advanced tiling strategies. Note that the changes are relevant only to the transform dialect, the tiling transformation itself already supports dynamic sizes. Depends On D129216 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D129217	2022-07-12 12:21:54 +00:00
Leonard Chan	474c873148	Revert "[llvm] cmake config groundwork to have ZSTD in LLVM" This reverts commit `f07caf20b9` which seems to break upstream https://lab.llvm.org/buildbot/#/builders/109/builds/42253.	2022-07-08 13:48:05 -07:00
Cole Kissane	f07caf20b9	[llvm] cmake config groundwork to have ZSTD in LLVM - added `FindZSTD.cmake` - added a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB` - likewise added have_zstd to compiler-rt/test/lit.common.cfg.py, clang-tools-extra/clangd/test/lit.cfg.py, and several lit.site.cfg.py.in files mirroring have_zlib behavior Reviewed By: leonardchan, MaskRay Differential Revision: https://reviews.llvm.org/D128465	2022-07-08 11:46:52 -07:00
Jacques Pienaar	e60cc52b79	[mlir][bzl] Update for `1a92dbcfa8` and `cab44c515c`	2022-07-07 17:36:28 -07:00
Adrian Kuegel	f066a0cd21	[llvm][Debuginfod][Bazel] Match dependencies in CMakeLists.txt. Also update llvm-config.h and llvm-config.h.cmake to match `484b1aa611` Differential Revision: https://reviews.llvm.org/D129252	2022-07-07 09:25:52 +02:00
NAKAMURA Takumi	71c9757474	[Bazel] Fixup to llvmorg-15-init-15618-ge0b520865026, s/dxil/dx/	2022-07-07 07:03:16 +09:00
Adrian Kuegel	3decc2f04d	[mlir][Bazel] Fix Bazel build after `a2158374ba`	2022-07-06 08:47:48 +02:00
Christian Sigg	3e01af093f	[mlir] Add InferIntRangeInterface to gpu.launch Infers block/grid dimensions/indices or ranges of such dimensions/indices. Reviewed By: krzysz00 Differential Revision: https://reviews.llvm.org/D129036	2022-07-05 07:14:54 +02:00
Nicolas Vasilache	7fbf55c927	[mlir][Tensor] Move ParallelInsertSlice to the tensor dialect This is moslty NFC and will allow tensor.parallel_insert_slice to gain rank-reducing semantics by reusing the vast majority of the tensor.insert_slice impl. Depends on D128857 Differential Revision: https://reviews.llvm.org/D128920	2022-07-04 01:53:12 -07:00
NAKAMURA Takumi	1ecfc12b0c	[Bazel] Make `builtin_headers_gen` as subset of CMake's `clang-resource-headers` At the moment, two files are not installed by CMake. - `lib/Headers/openmp_wrappers/time.h` - `lib/Headers/ppc_wrappers/nmmintrin.h` `builtin_headers_gen` is available as the source of rules_pkg. The difference of the layout of installed headers makes cache hit harder.	2022-07-03 15:46:38 +09:00
Arthur Eubanks	bcd153485e	[bazel] Fix invalid characters	2022-07-01 13:47:56 -07:00
Arthur Eubanks	5a65c5180e	[bazel] Port `43dc3190`, adding rules to generate dxil intrinsics	2022-07-01 13:38:43 -07:00
Mikhail Goncharov	de3fb0f29e	[fix/build] bazel rule for ParallelCombiningOpInterface	2022-07-01 10:34:15 +02:00
rdzhabarov	8e82bc840d	[fix/build] Fix bazel build rule.	2022-06-30 21:43:09 +00:00
rdzhabarov	c9be90986f	[Fix/Build] Fixing missing dependency for bazel.	2022-06-29 20:44:20 +00:00
Jacques Pienaar	701051a8c2	[mlir][shape] Switch types to ODS generated (NFC) These were already pretty simple, so just switching to generated.	2022-06-25 09:06:52 -07:00
Aart Bik	9a3d60e0d3	[mlir][bufferization][sparse] put restriction on sparse tensor allocation Putting some direct use restrictions on tensor allocations in the sparse case enables the use of simplifying assumptions in the bufferization analysis. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D128463	2022-06-24 10:58:43 -07:00
Siva Chandra Reddy	300f8da8e8	[libc] Add Uint128 type as a fallback when __uint128_t is not available. Also, the unused specializations of __int128_t have been removed. Differential Revision: https://reviews.llvm.org/D128304	2022-06-24 16:03:35 +00:00
Matthias Springer	3798678bd1	[mlir][sparse][bufferize] Implement BufferizableOpInterface Only the analysis part of the interface is implemented. The bufferization itself is performed by the SparseTensorConversion pass. Differential Revision: https://reviews.llvm.org/D128138	2022-06-24 13:47:01 +02:00
Okwan Kwon	1dd2c93a66	[mlir][linalg] move isElementwise() to Linalg/Utils (NFC) Differential Revision: https://reviews.llvm.org/D128398	2022-06-22 18:55:45 -07:00
Guillaume Chatelet	aeccc16497	Re-land [libc] Apply no-builtin everywhere, remove unnecessary flags This is a reland of D126773 / `b2a9ea4420`. The removal of `-mllvm -combiner-global-alias-analysis` has landed separately in D128051 / `7b73f53790`. And the removal of `-mllvm --tail-merge-threshold=0` is scheduled for removal in a subsequent patch.	2022-06-22 12:30:20 +00:00
Mahesh Ravishankar	2f637fe730	[mlir][TilingInterface] Enable tile and fuse using TilingInterface. This patch implements tile and fuse transformation for ops that implement the tiling interface. To do so, - `TilingInterface` needs a new method that generates a tiled implementation of the operation based on the tile of the result needed. - A pattern is added that replaces a `tensor.extract_slice` whose source is defined by an operation that implements the `TilingInterface` with a tiled implementation that produces the extracted slice in-place (using the method added to `TilingInterface`). - A pattern is added that takes a sequence of operations that implement the `TilingInterface` (for now `LinalgOp`s), tiles the consumer, and greedily fuses its producers iteratively. Differential Revision: https://reviews.llvm.org/D127809	2022-06-21 17:22:58 +00:00
Mahesh Ravishankar	c584771f54	Revert "[mlir][TilingInterface] Enable tile and fuse using TilingInterface." This reverts commit `ea75511319` due to build failures.	2022-06-21 16:56:59 +00:00
Mahesh Ravishankar	ea75511319	[mlir][TilingInterface] Enable tile and fuse using TilingInterface. This patch implements tile and fuse transformation for ops that implement the tiling interface. To do so, - `TilingInterface` needs a new method that generates a tiled implementation of the operation based on the tile of the result needed. - A pattern is added that replaces a `tensor.extract_slice` whose source is defined by an operation that implements the `TilingInterface` with a tiled implementation that produces the extracted slice in-place (using the method added to `TilingInterface`). - A pattern is added that takes a sequence of operations that implement the `TilingInterface` (for now `LinalgOp`s), tiles the consumer, and greedily fuses its producers iteratively. Differential Revision: https://reviews.llvm.org/D127809	2022-06-21 16:47:14 +00:00
Matthias Springer	858be16670	[mlir][memref] Fix layout map computation in inferRankReducedResultType Differential Revision: https://reviews.llvm.org/D128160	2022-06-21 10:08:26 +02:00
Mogball	d883a02a7c	[mlir][ods] Remove StructAttr Depends on D127373 Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D127375	2022-06-21 01:10:05 +00:00
Alex Zinenko	8b68da2c7d	[mlir] move SCF headers to SCF/{IR,Transforms} respectively This aligns the SCF dialect file layout with the majority of the dialects. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D128049	2022-06-20 10:18:01 +02:00
NAKAMURA Takumi	0716b3ec09	[Bazel] Rename generated *_main.cpp to [tool-name]-driver.cpp which CMake uses. Fixup to llvmorg-15-init-12347-gf06abbb39380 The difference of basename affects its emitted object file. FIXME: Each rule's name is left as origin.	2022-06-18 13:35:23 +09:00
Benjamin Kramer	d5c29b23e1	[mlir][sparse] Inline the definition of LLVM_ATTRIBUTE_WEAK This library is supposed not to have a dependency on LLVM, and linking LLVMSupport into it breaks its shared library setup.	2022-06-17 22:41:10 +02:00
Benjamin Kramer	feb8313fbf	[bazel] Add missing dependency after 455679413 This is supposed to be header-only. Don't know how to express that in bazel.	2022-06-17 22:09:13 +02:00
Frederik Gossen	e694b22344	[mlir][nvgpu] Fix Bazel BUILD file Differential Revision: https://reviews.llvm.org/D128078	2022-06-17 14:35:30 -04:00
Christopher Bate	51b925df94	[mlir][nvgpu] shared memory access optimization pass This change adds a transformation and pass to the NvGPU dialect that attempts to optimize reads/writes from a memref representing GPU shared memory in order to avoid bank conflicts. Given a value representing a shared memory memref, it traverses all reads/writes within the parent op and, subject to suitable conditions, rewrites all last dimension index values such that element locations in the final (col) dimension are given by `newColIdx = col % vecSize + perm[row](col/vecSize,row)` where `perm` is a permutation function indexed by `row` and `vecSize` is the vector access size in elements (currently assumes 128bit vectorized accesses, but this can be made a parameter). This specific transformation can help optimize typical distributed & vectorized accesses common to loading matrix multiplication operands to/from shared memory. Differential Revision: https://reviews.llvm.org/D127457	2022-06-17 09:31:05 -06:00
Guillaume Chatelet	7b73f53790	[libc] Rely on __builtin_memcpy_inline for memcpy implementation This patch removes usage of `-mllvm -combiner-global-alias-analysis` and relies on compiler builtin to implement `memcpy`. Note that `-mllvm -combiner-global-alias-analysis` is actually only useful for functions where buffers can alias (namely `memcpy` and `memmove`). The other memory functions where not benefiting from the flag anyways. The upside is that the memory functions can now be compiled from source with thinlto (thinlto would not be able to carry on the flag when doing inlining). The downside is that for compilers other than clang (i.e. not providing `__builtin_memcpy_inline`) the codegen may be worse. Differential Revision: https://reviews.llvm.org/D128051	2022-06-17 14:22:26 +00:00
Alex Zinenko	610139d2d9	[mlir] replace 'emit_c_wrappers' func->llvm conversion option with a pass The 'emit_c_wrappers' option in the FuncToLLVM conversion requests C interface wrappers to be emitted for every builtin function in the module. While this has been useful to bootstrap the interface, it is problematic in the longer term as it may unintentionally affect the functions that should retain their existing interface, e.g., libm functions obtained by lowering math operations (see D126964 for an example). Since D77314, we have a finer-grain control over interface generation via an attribute that avoids the problem entirely. Remove the 'emit_c_wrappers' option. Introduce the '-llvm-request-c-wrappers' pass that can be run in any pipeline that needs blanket emission of functions to annotate all builtin functions with the attribute before performing the usual lowering that accounts for the attribute. Reviewed By: chelini Differential Revision: https://reviews.llvm.org/D127952	2022-06-17 11:10:31 +02:00
Guillaume Chatelet	c26366979b	[libc][bazel] Remove memcpy dependency in memmove	2022-06-17 09:07:24 +00:00
bixia1	bbb73ade43	[mlir][complex] Add Python bindings for complex ops. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D127916	2022-06-16 14:19:11 -07:00
Guillaume Chatelet	4a6929f811	Revert "[libc] Apply no-builtin everywhere, remove unnecessary flags" This reverts commit `b2a9ea4420`.	2022-06-16 09:28:17 +00:00
Jacques Pienaar	02e32708bd	[mlir][bzl] Export textmate grammar file	2022-06-15 11:28:43 -07:00
Rob Suderman	640973f2b9	[tosa] Lower tosa.slice to tensor.slice for dynamic case Existing slice lowering only supporting static shapes. Reviewed By: NatashaKnk Differential Revision: https://reviews.llvm.org/D127704	2022-06-15 09:54:36 -07:00
Thomas Raoux	6834803c3d	[mlir][vector] NFC remove dependency of VectorTransform to GPU dialect Make the reduction distribution pattern more generic and remove layering problem. The new pattern to distribute reduction is now independent of GPU and takes a lamdba to decide how the distributed reduction should be generated. Differential Revision: https://reviews.llvm.org/D127867	2022-06-15 16:08:29 +00:00
Benjamin Kramer	55b76fb58e	[bazel] Port `b0b0043209`	2022-06-14 23:55:50 +02:00
Benjamin Kramer	8224fb7ef9	[bazel] Port `75bfc6f295`	2022-06-14 23:46:02 +02:00
Mogball	ead75d9434	(Reland)[mlir] Add a generic data-flow analysis framework Removes one element of the pointer union to make it work on 32-bit systems. This patch introduces a generic data-flow analysis framework to MLIR. The framework implements a fixed-point iteration algorithm and a dependency graph between lattice states and analysis. Lattice states and points are fully extensible to support highly-customizable analyses. Reviewed By: phisiart, rriddle Differential Revision: https://reviews.llvm.org/D126751	2022-06-14 21:33:05 +00:00
Frederik Gossen	a6fa12ab3b	Revert "[mlir] Add a generic data-flow analysis framework" This reverts commit `9dea117283`. The PointerUnion assumes 3 available bits, which is not the case on 32-bit machines.	2022-06-14 17:14:27 -04:00
Mogball	9dea117283	[mlir] Add a generic data-flow analysis framework This patch introduces a generic data-flow analysis framework to MLIR. The framework implements a fixed-point iteration algorithm and a dependency graph between lattice states and analysis. Lattice states and points are fully extensible to support highly-customizable analyses. Reviewed By: phisiart, rriddle Differential Revision: https://reviews.llvm.org/D126751	2022-06-14 16:54:15 +00:00
Thomas Raoux	087aba4f0f	[mlir][vector] Add pattern to distribute vector reduction to GPU shuffles Add a pattern to do ad hoc lowering of vector.reduction to a sequence of warp shuffles. This allow distributing reduction on a warp for GPU targets. Also add an execution test for warp reduction. co-authored with @springerm Differential Revision: https://reviews.llvm.org/D127176	2022-06-14 05:49:16 +00:00
Benjamin Kramer	b8cdff8894	[bazel] Unbreak the build after `cf6a7c1947`	2022-06-14 00:23:17 +02:00
Mahesh Ravishankar	cf6a7c1947	[mlir][TilingInterface] Add pattern to tile using TilingInterface and implement TilingInterface for Linalg ops. This patch adds support for tiling operations that implement the TilingInterface. - It separates the loop constructs that are used to iterate over tile from the implementation of the tiling itself. For example, the use of destructive updates is more related to use of scf.for for iterating over tiles that are tensors. - To test the transformation, TilingInterface is implemented for LinalgOps. The separation of the looping constructs used from the implementation of tile code generation greatly simplifies the latter. - The implementation of TilingInterface for LinalgOp is kept as an external model for now till this approach can be fully flushed out to replace the existing tiling + fusion approaches in Linalg. Differential Revision: https://reviews.llvm.org/D127133	2022-06-13 20:37:44 +00:00
Mogball	e16d13322b	[mlir] (NFC) Clean up bazel and CMake target names All dialect targets in bazel have been named Dialect and all dialect targets in CMake have been named MLIRDialect.	2022-06-13 16:24:15 +00:00
Benjamin Kramer	914e30ca5c	[bazel] Change references to driver-template.cpp.in after `6bc8163c79`	2022-06-10 10:22:55 +02:00
Adrian Kuegel	61132005a9	Fix bazel BUILD.	2022-06-10 08:26:00 +02:00
Mogball	2af69c6751	[mlir][NFC] Rename Bazel target aliases and consolidate targets This patch completes outstanding TODOs of removing aliases bazel target names. This patch also renames and cosolidates some bazel targets to be more in line with their CMake counterparts, e.g. combining `:LinalgOps` and `:LinalgInterfaces` into `:LinalgDialect`. Differential Revision: https://reviews.llvm.org/D127459	2022-06-09 23:58:07 +00:00
Mogball	a31ff0af9b	[mlir][spirv] Replace StructAttrs with AttrDefs Depends on D127370 Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D127373	2022-06-09 23:16:44 +00:00
Mogball	f1182bd6d5	[mlir][tosa] Replace StructAttrs with AttrDefs Depends on D127352 Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D127370	2022-06-09 23:01:51 +00:00
Mogball	d7ef488bb6	[mlir][gpu] Move GPU headers into IR/ and Transforms/ Depends on D127350 Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D127352	2022-06-09 22:49:03 +00:00
Mogball	7bdd3722f2	[mlir][gpu] Change ParalellLoopMappingAttr to AttrDef It was a StructAttr. Also adds a FieldParser for AffineMap. Depends on D127348 Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D127350	2022-06-09 22:23:21 +00:00
Benjamin Kramer	3ae85bd67a	[bazel] Add missing dependency after `9f1221521f`.	2022-06-09 22:41:32 +02:00
Matthias Springer	461dafd2a3	[mlir][bufferization] Add OneShotBufferize transform op This commit allows for One-Shot Bufferize to be used through the transform dialect. No op handle is currently returned for the bufferized IR. Differential Revision: https://reviews.llvm.org/D125098	2022-06-09 15:15:09 +02:00
Alex Zinenko	5f0d4f208e	[mlir] Introduce Transform ops for loops Introduce transform ops for "for" loops, in particular for peeling, software pipelining and unrolling, along with a couple of "IR navigation" ops. These ops are intended to be generalized to different kinds of loops when possible and therefore use the "loop" prefix. They currently live in the SCF dialect as there is no clear place to put transform ops that may span across several dialects, this decision is postponed until the ops actually need to handle non-SCF loops. Additionally refactor some common utilities for transform ops into trait or interface methods, and change the loop pipelining to be a returning pattern. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D127300	2022-06-09 11:41:55 +02:00
bixia1	ea8ed5cbcf	[mlir][sparse] Add F16 and BF16. This is the first PR to add `F16` and `BF16` support to the sparse codegen. There are still problems in supporting these two data types, such as `BF16` is not quite working yet. Add tests cases. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D127010	2022-06-08 09:51:05 -07:00
Tue Ly	63aa853389	[libc] Add expm1f function to bazel's build overlay. Add expm1f function to bazel's build overlay. Reviewed By: gchatelet Differential Revision: https://reviews.llvm.org/D127298	2022-06-08 09:49:47 -04:00
Reid Kleckner	570e76bb6c	[config] Remove vestigial LLVM_VERSION_INFO This has been superseded by the llvm/Support/VCSRevision.h header. So far as I can tell, nothing in the CMake build sets LLVM_VERSION_INFO. It was always undefined, and the ifdefs using it were dead. However, CMake is very flexible, so it's possible that I missed some ways to set this variable. One could, for example, probably pass -DLLVM_VERSION_INFO=x on the command line and get that through to configure_file, or set the variable in an obscure way (`set(${proj}_VERSION_INFO "x")`). I'm reasonably confident that isn't happening, but I'd like a second opinion. Update the Bazel and gn builds accordingly. Differential Revision: https://reviews.llvm.org/D126977	2022-06-07 11:36:26 -07:00
Reid Kleckner	b1c7889f32	[config] Remove RETSIGTYPE from config.h.cmake, NFC This doesn't need to be configurable. It was hardcoded to void in all LLVM build systems.	2022-06-07 11:35:25 -07:00
Mogball	5cac7cda95	[mlir][bazel] fix bazel build on VectorTransforms	2022-06-06 21:51:17 +00:00
Chris Bieneman	f06abbb393	LLVM Driver Multicall tool This patch adds an llvm-driver multicall tool that can combine multiple LLVM-based tools. The build infrastructure is enabled for a tool by adding the GENERATE_DRIVER option to the add_llvm_executable CMake call, and changing the tool's main function to a canonicalized tool_name_main format (i.e. llvm_ar_main, clang_main, etc...). As currently implemented llvm-driver contains dsymutil, llvm-ar, llvm-cxxfilt, llvm-objcopy, and clang (if clang is included in the build). llvm-driver can be enabled from builds by setting LLVM_TOOL_LLVM_DRIVER_BUILD=On. There are several limitations in the current implementation, which can be addressed in subsequent patches: (1) the multicall binary cannot currently properly handle multi-dispatch tools. This means symlinking llvm-ranlib to llvm-driver will not properly result in llvm-ar's main being called. (2) the multicall binary cannot be comprised of tools containing conflicting cl::opt options as the global cl::opt option list cannot contain duplicates. These limitations can be addressed in subsequent patches. Differential revision: https://reviews.llvm.org/D109977	2022-06-06 04:27:32 +00:00
Christian Sigg	400fef081a	Recommit: "[MLIR][NVVM] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration." This change rolls `bcfc0a9051` forward (i.e., reverting `369ce54bb3`) with fixed CMakeLists.txt.	2022-06-05 09:11:43 +02:00
Mehdi Amini	369ce54bb3	Revert "[MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration." This reverts commit `bcfc0a9051`. The build is broken with shared library enabled.	2022-06-04 08:35:45 +00:00
Christian Sigg	bcfc0a9051	[MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration. This is correct for all values, i.e. the same as promoting the division to fp32 in the NVPTX backend. But it is faster (~10% in average, sometimes more) because: - it performs less Newton iterations - it avoids the slow path for e.g. denormals - it allows reuse of the reciprocal for multiple divisions by the same divisor Test program: ``` #include <stdio.h> #include "cuda_fp16.h" // This is a variant of CUDA's own __hdiv which is fast than hdiv_promote below // and doesn't suffer from the perf cliff of div.rn.fp32 with 'special' values. __device__ half hdiv_newton(half a, half b) { float fa = __half2float(a); float fb = __half2float(b); float rcp; asm("{rcp.approx.ftz.f32 %0, %1;\n}" : "=f"(rcp) : "f"(fb)); float result = fa * rcp; auto exponent = reinterpret_cast<const unsigned&>(result) & 0x7f800000; if (exponent != 0 && exponent != 0x7f800000) { float err = __fmaf_rn(-fb, result, fa); result = __fmaf_rn(rcp, err, result); } return __float2half(result); } // Surprisingly, this is faster than CUDA's own __hdiv. __device__ half hdiv_promote(half a, half b) { return __float2half(__half2float(a) / __half2float(b)); } // This is an approximation that is accurate up to 1 ulp. __device__ half hdiv_approx(half a, half b) { float fa = __half2float(a); float fb = __half2float(b); float result; asm("{div.approx.ftz.f32 %0, %1, %2;\n}" : "=f"(result) : "f"(fa), "f"(fb)); return __float2half(result); } __global__ void CheckCorrectness() { int i = threadIdx.x + blockIdx.x * blockDim.x; half x = reinterpret_cast<const half&>(i); for (int j = 0; j < 65536; ++j) { half y = reinterpret_cast<const half&>(j); half d1 = hdiv_newton(x, y); half d2 = hdiv_promote(x, y); auto s1 = reinterpret_cast<const short&>(d1); auto s2 = reinterpret_cast<const short&>(d2); if (s1 != s2) { printf("%f (%u) / %f (%u), got %f (%hu), expected: %f (%hu)\n", __half2float(x), i, __half2float(y), j, __half2float(d1), s1, __half2float(d2), s2); //__trap(); } } } __device__ half dst; __global__ void ProfileBuiltin(half x) { #pragma unroll 1 for (int i = 0; i < 10000000; ++i) { x = x / x; } dst = x; } __global__ void ProfilePromote(half x) { #pragma unroll 1 for (int i = 0; i < 10000000; ++i) { x = hdiv_promote(x, x); } dst = x; } __global__ void ProfileNewton(half x) { #pragma unroll 1 for (int i = 0; i < 10000000; ++i) { x = hdiv_newton(x, x); } dst = x; } __global__ void ProfileApprox(half x) { #pragma unroll 1 for (int i = 0; i < 10000000; ++i) { x = hdiv_approx(x, x); } dst = x; } int main() { CheckCorrectness<<<256, 256>>>(); half one = __float2half(1.0f); ProfileBuiltin<<<1, 1>>>(one); // 1.001s ProfilePromote<<<1, 1>>>(one); // 0.560s ProfileNewton<<<1, 1>>>(one); // 0.508s ProfileApprox<<<1, 1>>>(one); // 0.304s auto status = cudaDeviceSynchronize(); printf("%s\n", cudaGetErrorString(status)); } ``` Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D126158	2022-06-04 08:03:29 +02:00
Reid Kleckner	d82b4fe50d	[bazel] Update build for config.h.cmake change	2022-06-03 12:58:04 -07:00
Benjamin Kramer	389c0b81d3	[bazel] Port `95aff23e29`	2022-06-03 20:43:23 +02:00
Benjamin Kramer	d9de52819d	[bazel] Add a missing dependency after `f3bdb56d61`	2022-06-01 22:52:06 +02:00
Stella Laurenzo	3bb7999339	[mlir] Add global_load and global_store ops to ml_program. * Adds simple, non-atomic, non-volatile, non-synchronized direct load/store ops. Differential Revision: https://reviews.llvm.org/D126230	2022-06-01 11:32:15 -07:00
Guillaume Chatelet	ffa479a452	[libc] fix typo in BUILD.bazel feature	2022-06-01 13:53:36 +00:00
Guillaume Chatelet	b2a9ea4420	[libc] Apply no-builtin everywhere, remove unnecessary flags Note, this is a re-submission of D125894 with `features = ["-header_modules"]` added to the main BUILD.bazel file. Some functions like `stpncpy` are implemented in terms of `memset` but are not currently using `-fno-builtin-memset`. This is somewhat hidden by the fact that we use `-ffreestanding` globally and that `-ffreestanding` implies `-fno-builtin` for Clang. This patch also removes `-mllvm -combiner-global-alias-analysis` that is Clang specific and that does not bring substantial gains on modern processors. Also we keep `-mllvm --tail-merge-threshold=0` for aarch64 in CMakeLists.txt but we omit it in the Bazel config. This is because Bazel consumes the source files directly and so it can use PGO to take optimal decisions locally. Differential Revision: https://reviews.llvm.org/D126773	2022-06-01 13:34:36 +00:00
Christian Sigg	f330db8b14	Fix bazel build after `59b273a166`.	2022-06-01 13:13:18 +02:00
Christian Sigg	7cb8b973fa	Fix bazel build after `59b273a166`. Reviewed By: tpopp Differential Revision: https://reviews.llvm.org/D126765	2022-06-01 12:12:04 +02:00
Reid Kleckner	17296607a7	Revert "[Bazel][GN] Reuse the GN LLVM config file generation code" This reverts commit `e2ee8bf981`. This change is beyond my ability to integrate into Google's internal build configuration tonight.	2022-05-31 21:15:46 -07:00
Reid Kleckner	e2ee8bf981	[Bazel][GN] Reuse the GN LLVM config file generation code Currently, the Bazel build uses static, checked in [llvm-]config.h files in combination with global macro definitions to mimic CMake's generated headers. This change reuses the write_cmake_config.py script from the GN build to generate the headers from source in the same way. The purpose is to ensure that the Bazel build stays up to date with any changes to the CMake config files. The write_cmake_config.py script has good error checking to ensure that unneeded, stale variables are not passed, and that any missing variables are reported as errors. I tried to closely follow the logic in the GN build here: llvm/utils/gn/secondary/llvm/include/Config/BUILD.gn The duplication between this file and config.bzl is significant, and we could consider going further, but I'd like to hold off on it for now. The GN build changes are to move the write_cmake_config.py script up to //llvm/utils/write_cmake_config.py, and update the paths accordingly. The next logical change is to generate Clang's config.h header. Differential Revision: https://reviews.llvm.org/D126581	2022-05-31 19:40:05 -07:00
Reid Kleckner	3c31c68c90	[Bazel] Add missing dep after mlgo test change `f46dd19b48`	2022-05-31 19:39:42 -07:00
Fangrui Song	da9d41cb87	[Bazel] Fix typo: startlark=>starlark	2022-05-31 14:12:41 -07:00
Adrian Kuegel	c7bee26f4f	[mlir][Bazel] Adjust BUILD.bazel file	2022-05-31 14:04:07 +02:00
Benjamin Kramer	110a20b70e	[bazel] Port `42c17073fc`	2022-05-31 13:52:05 +02:00
Mikhail Goncharov	d861088024	Fix bazel build After `1c2edb026e`	2022-05-31 10:15:39 +02:00
Alex Zinenko	3f71765a71	[mlir] provide Python bindings for the Transform dialect Python bindings for extensions of the Transform dialect are defined in separate Python source files that can be imported on-demand, i.e., that are not imported with the "main" transform dialect. This requires a minor addition to the ODS-based bindings generator. This approach is consistent with the current model for downstream projects that are expected to bundle MLIR Python bindings: such projects can include their custom extensions into the bundle similarly to how they include their dialects. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D126208	2022-05-30 17:37:52 +02:00
Alex Zinenko	cc6c159203	[mlir] add VectorizeOp to structured transform ops Vectorization is a key transformation to achieve high performance on most architectures. In the transform dialect, vectorization is implemented as a parameterizable transform op. It currently applies to a scope of payload IR delimited by some isolated-from-above op, mainly because several enabling transformations (such as affine simplification) are needed to perform vectorization and these transformation would apply to ops other than the "main" computational payload op. A separate "navigation" transform op that obtains the isolated-from-above ancestor of an op is introduced in the core transform dialect. Even though it is currently only useful for vectorization, isolated-from-above ops are a common anchor for transformations (usually implemented as passes) that is likely to be reused in the future. Depends On D126374 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D126542	2022-05-30 17:37:50 +02:00
NAKAMURA Takumi	de20fb72ad	[bazel] BLAKE3: Adopt aarch64 and x86_64. FIXME: arm(32) may be applicable here. I haven't tested yet. Differential Revision: https://reviews.llvm.org/D126543	2022-05-28 07:05:30 +09:00
Daniele Vettorel	b479ea4b0a	Add llvm-debuginfod-find tool to Bazel build Add missing `llvm-debuginfod-find` tool to the Bazel build. Patch by: vettoreldaniele. Reviewed By: GMNGeoffrey Differential Revision: https://reviews.llvm.org/D126489	2022-05-27 10:22:44 -04:00
wren romano	2046e11ac4	[mlir][sparse] Improving ExecutionEngine/SparseTensorUtils.h This change makes the public API of SparseTensorUtils.cpp explicit, whereas before the publicity of these functions was only implicit. Implicit publicity is sufficient for mlir-opt to generate calls to these functions, but it's not enough to enable C/C++ code to call them directly in the usual way (i.e., without going through codegen). Thus, leaving the publicity implicit prevents development of other tools (e.g., microbenchmarks). In addition this change also marks the functions MLIR_CRUNNERUTILS_EXPORT, which is required by the JIT under certain configurations (albeit not for anything in our test suite). Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D126105	2022-05-26 17:22:08 -07:00
NAKAMURA Takumi	6f434776da	[bazel] Introduce "VE" CodeGen in LLVM.	2022-05-26 22:39:49 +09:00
Alex Zinenko	73c3dff1b3	[mlir] Use-after-free checker for the Transform dialect The Transform dialect uses the side effect modeling mechanism to record the effects of the transform ops on the mapping between Transform IR values and Payload IR ops. Introduce a checker pass that warns if a Transform IR value is used after it has been freed (consumed). This pass is mostly intended as a debugging aid in addition to the verification/assertion mechanisms in the transform interpreter. It reports all potential use-after-free situations. The implementation makes a series of simplifying assumptions to be simple and conservative. A more advanced implementation would rely on the data flow-like analysis associated with a side-effect resource rather than a value, which is currently not supported by the analysis infrastructure. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D126381	2022-05-26 12:28:41 +02:00
NAKAMURA Takumi	65ab6b495a	[bazel] Unset REVISION as if LLVM_APPEND_VC_REV=OFF, for now. We could implement retrieving the revision here, but we may avoid "Just the same but only different revision hash string".	2022-05-26 06:24:49 +09:00
NAKAMURA Takumi	801ac2ebf1	[bazel] Bump to 15.0.0git	2022-05-26 06:24:49 +09:00
Matthias Springer	210c4e7fc8	[mlir][bufferization] Fix Python bindings Differential Revision: https://reviews.llvm.org/D126179	2022-05-23 18:12:56 +02:00
Matthias Springer	ffdbecccaf	[mlir][bufferization] Add bufferization.alloc_tensor op This change adds a new op `alloc_tensor` to the bufferization dialect. During bufferization, this op is always lowered to a buffer allocation (unless it is "eliminated" by a pre-processing pass). It is useful to have such an op in tensor land, because it allows users to model tensor SSA use-def chains (which drive bufferization decisions) and because tensor SSA use-def chains can be analyzed by One-Shot Bufferize, while memref values cannot. This change also replaces all uses of linalg.init_tensor in bufferization-related code with bufferization.alloc_tensor. linalg.init_tensor and bufferization.alloc_tensor are similar, but the purpose of the former one is just to carry a shape. It does not indicate a memory allocation. linalg.init_tensor is not suitable for modelling SSA use-def chains for bufferization purposes, because linalg.init_tensor is marked as not having side effects (in contrast to alloc_tensor). As such, it is legal to move linalg.init_tensor ops around/CSE them/etc. This is not desirable for alloc_tensor; it represents an explicit buffer allocation while still in tensor land and such allocations should not suddenly disappear or get moved around when running the canonicalizer/CSE/etc. BEGIN_PUBLIC No public commit message needed for presubmit. END_PUBLIC Differential Revision: https://reviews.llvm.org/D126003	2022-05-21 02:47:32 +02:00
Dmitri Gribenko	30628b0ecc	Use the public clang::Builtin API in the unit test	2022-05-20 18:09:09 +02:00
Dmitri Gribenko	cf31db35a7	Adjust BUILD files for [ifs] Switch to using OptTable	2022-05-20 18:08:29 +02:00
Dmitri Gribenko	73bd60b843	Adjust BUILD files for [MLIR][GPU] Add NvGpu mma.sync path to the VectorToGPU pass	2022-05-20 18:03:20 +02:00
Aart Bik	28b6d412af	[mlir][sparse] add support for complex zero/one building Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D126039	2022-05-20 08:53:30 -07:00
Guillaume Chatelet	0443bfabe7	Revert "[libc] Apply no-builtin everywhere, remove unnecessary flags" This reverts commit `94d6dd9057`.	2022-05-20 14:37:17 +00:00
Alex Brachet	c3856cb739	[bazel][libc] Fix bazel build Differential revision: https://reviews.llvm.org/D126028	2022-05-19 22:58:50 +00:00
Jorge Gorbe Moya	221b7a4583	[bazel] Add lib/Basic/BuiltinTargetFeatures.h to clang:basic `hdrs`. This header is included by clang/unittests/CodeGen/CheckTargetFeaturesTest.cpp so it needs to be exposed here to make it visible.	2022-05-19 14:20:17 -07:00
Guillaume Chatelet	94d6dd9057	[libc] Apply no-builtin everywhere, remove unnecessary flags Some functions like `stpncpy` are implemented in terms of `memset` but are not currently using `-fno-builtin-memset`. This is somewhat hidden by the fact that we use `-ffreestanding` globally and that `-ffreestanding` implies `-fno-builtin` for Clang. This patch also removes `-mllvm -combiner-global-alias-analysis` that is Clang specific and that does not bring substantial gains on modern processors. Also we keep `-mllvm --tail-merge-threshold=0` for aarch64 in CMakeLists.txt but we omit it in the Bazel config. This is because Bazel consumes the source files directly and so it can use PGO to take optimal decisions locally. Differential Revision: https://reviews.llvm.org/D125894	2022-05-19 09:08:42 +00:00
Stella Laurenzo	8b7e85f4f8	[mlir][python] Add Python bindings for ml_program dialect. Differential Revision: https://reviews.llvm.org/D125852	2022-05-18 23:08:33 -07:00
Stella Laurenzo	2bb252852c	[mlir] Add GlobalOp, GlobalLoadConstOp to ml_program. The approach I took was to define a dialect 'extern' attribute that a GlobalOp can take as a value to signify external linkage. I think this approach should compose well and should also work with wherever the OpaqueElements work goes in the future (since that is just another kind of attribute). I special cased the GlobalOp parser/printer for this case because it is significantly easier on the eyes. In the discussion, Jeff Niu had proposed an alternative syntax for GlobalOp that I ended up not taking. I did try to implement it but a) I don't think it made anything easier to read in the common case, and b) it made the parsing/printing logic a lot more complicated (I think I would need a completely custom parser/printer to do it well). Please have a look at the common cases where the global type and initial value type match: I don't think how I have it is too bad. The less common cases seem ok to me. I chose to only implement the direct, constant load op since that is non side effecting and there was still discussion pending on that. Differential Revision: https://reviews.llvm.org/D124318	2022-05-18 23:08:28 -07:00
Benjamin Kramer	e497871356	[mlir][complex] Add pow/sqrt/tanh ops and lowering to libm Lowering through libm gives us a baseline version, even though it's not going to be particularly fast. This is similar to what we do for some math dialect ops. Differential Revision: https://reviews.llvm.org/D125550	2022-05-18 14:03:14 +02:00
Aart Bik	736c1b66ef	[mlir][sparse] introduce complex type to sparse tensor support This is the first implementation of complex (f64 and f32) support in the sparse compiler, with complex add/mul as first operations. Note that various features are still TBD, such as other ops, and reading in complex values from file. Also, note that the std::complex<float> had a bit of an ABI issue when passed as single argument. It is still TBD if better solutions are possible. Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D125596	2022-05-16 13:17:36 -07:00
Benjamin Kramer	cc88212d81	[bazel] Port `ae8bbc43f4`	2022-05-14 12:12:20 +02:00
Tres Popp	1dce51b888	[mlir] Add TensorToLinalgPass This pass is to handle computationally complex operations like tensor.pad which are not simply lowered to the exact same operation in the memref dialect. Differential Revision: https://reviews.llvm.org/D125384	2022-05-13 12:17:22 +02:00
Michael Jones	dd7f30464b	[libc] fix uint includes and libc bazel This patch fixes the includes for the new UInt class so that the api test now passes, additionally it fixes the bazel files to account for the new dependencies. Differential Revision: https://reviews.llvm.org/D125490	2022-05-12 11:40:52 -07:00
Thomas Raoux	d02f10d96d	[mlir][vector] Add lowering pattern for vector.warp_execute_on_lane_0 op Add lowering of the vector.warp_execute_on_lane_0 into scf.if plus memory transfer for the operands and yield values. This also add an integration test running on GPU warp. The same tests can be later re-used with different comment lines to tests distribution transformations. This is mostly from @springerm contribution. Differential Revision: https://reviews.llvm.org/D125430	2022-05-12 13:27:43 +00:00
Benjamin Kramer	303638248a	[mlir][linalg] Add lowering of named ops on complex numbers This lets linalg.dot and friends lower to a complex muladd using ops from the complex dialect. Differential Revision: https://reviews.llvm.org/D125461	2022-05-12 13:37:34 +02:00
Benjamin Kramer	ca6cbbe8d0	[bazel] Add support for configuring the bazel build for PPC TF already carries a patch for this.	2022-05-12 12:04:14 +02:00
Vibhuti Sawant	6b6e796b74	[Bazel] Add support for s390x build target While executing the test suite for Tensorflow(v2.8.0), we encountered multiple TC failures with the below error ``` 'z14' is not a recognized processor for this target ``` This patch adds the s390x target to the build target list. It fixes TC failures in multiple modules of Tensorflow on s390x arch. It is also tested to have no effect on x86 machines. Reviewed By: GMNGeoffrey Differential Revision: https://reviews.llvm.org/D125096	2022-05-11 09:23:16 -07:00
Thomas Raoux	15bcc36eed	[mlir][gpu] Move async copy ops to NVGPU and add caching hints Move async copy operations to NVGPU as they only exist on NV target and are designed to match ptx semantic. This allows us to also add more fine grain caching hint attribute to the op. Add hint to bypass L1 and hook it up to NVVM op. Differential Revision: https://reviews.llvm.org/D125244	2022-05-10 22:30:24 +00:00
Krzysztof Drewniak	814b605095	[mlir][AMDGPU] Add AMDGPU conversion patterns to ConvertGPUToROCDL This ensures that attributes such as the index bitwidth propagate correctly to the AMDGPUToROCDL patterns. Differential Revision: https://reviews.llvm.org/D125320	2022-05-10 16:49:11 +00:00
Krzysztof Drewniak	f1f05a91ca	[MLIR][AMDGPU] Add AMDGPU dialect, wrappers around raw buffer intrinsics By analogy with the NVGPU dialect, introduce an AMDGPU dialect for AMD-specific intrinsic wrappers. The dialect initially includes wrappers around the raw buffer intrinsics. On AMD GPUs, a memref can be converted to a "buffer descriptor" that allows more precise control of memory access, such as by allowing for out of bounds loads/stores to be replaced by 0/ignored without adding additional conditional logic, which is important for performance. The repository currently contains a limited conversion from transfer_read/transfer_write to Mubuf intrinsics, which are an older, deprecated intrinsic for the same functionality. The new amdgpu.raw_buffer_* ops allow these operations to be used explicitly and for including metadata such as whether the target chipset is an RDNA chip or not (which impacts the interpretation of some bits in the buffer descriptor), while still maintaining an MLIR-like interface. (This change also exposes the floating-point atomic add intrinsic.) Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D122765	2022-05-10 14:59:58 +00:00
Benjamin Kramer	9eccc7357e	[bazel] Fix the build after `2c33266084`	2022-05-03 23:04:10 +02:00
Eric Li	aaddfbf9d6	[bazel] Add test targets for dataflow framework Differential Revision: https://reviews.llvm.org/D124819	2022-05-03 14:09:10 +00:00
Matthias Springer	3c2a74a3ae	[mlir][linalg][transform] Add TileOp to transform dialect This commit adds a tiling op to the transform dialect as an external op. Differential Revision: https://reviews.llvm.org/D124661	2022-04-29 21:35:31 +09:00
Benjamin Kramer	b8d4fe0f0a	[bazel] Port `92a836da07`	2022-04-28 22:51:27 +02:00
Benjamin Kramer	1fbdf3a02e	[bazel] Port `84fe39a45b`	2022-04-28 18:29:43 +02:00
Stephan Herhut	c10bbc20bc	[mlir][bazel] Add suport for PDLL tests. Differential Revision: https://reviews.llvm.org/D124515	2022-04-27 12:35:12 +02:00
Adrian Kuegel	c2a8490193	[mlir][Bazel] Add missing dependencies. When building with layering_check enabled, there needs to be a dependency for each header include.	2022-04-25 09:24:07 +02:00
Fangrui Song	bbbc49f780	[Bazel] Add more mlir dependencies after D124298 The Bazel layering_check feature compiles libraries with `-fmodule-name=X -fmodules-strict-decluse` which require #include to be in deps.	2022-04-23 13:06:54 -07:00
Fangrui Song	6c8612fe6f	[Bazel] Make mlir:BufferizationDialect depend on mlir:ArithmeticDialect after D124298	2022-04-23 12:59:37 -07:00
Alex Zinenko	40a8bd635b	[mlir] use side effects in the Transform dialect Currently, the sequence of Transform dialect operations only supports a single use of each operand (verified by the `transform.sequence` operation). This was originally motivated by the need to guard against accessing a payload IR operation associated with a transform IR value after this operation has likely been rewritten by a transformation. However, not all Transform dialect operations rewrite payload IR, in particular the "navigation" operation such as `transform.pdl_match` do not. Introduce memory effects to the Transform dialect operations to describe their effect on the payload IR and the mapping between payload IR opreations and transform IR values. Use these effects to replace the single-use rule, allowing repeated reads and disallowing use-after-free, where operations with the "free" effect are considered to "consume" the transform IR value and rewrite the corresponding payload IR operations). As an additional improvement, this enables code motion transformation on the transform IR itself. Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D124181	2022-04-22 23:29:11 +02:00

1 2 3 4 5 ...

642 Commits