llvm-project

Commit Graph

Author	SHA1	Message	Date
Christian Sigg	400fef081a	Recommit: "[MLIR][NVVM] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration." This change rolls `bcfc0a9051` forward (i.e., reverting `369ce54bb3`) with fixed CMakeLists.txt.	2022-06-05 09:11:43 +02:00
Mehdi Amini	369ce54bb3	Revert "[MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration." This reverts commit `bcfc0a9051`. The build is broken with shared library enabled.	2022-06-04 08:35:45 +00:00
Christian Sigg	bcfc0a9051	[MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration. This is correct for all values, i.e. the same as promoting the division to fp32 in the NVPTX backend. But it is faster (~10% in average, sometimes more) because: - it performs less Newton iterations - it avoids the slow path for e.g. denormals - it allows reuse of the reciprocal for multiple divisions by the same divisor Test program: ``` #include <stdio.h> #include "cuda_fp16.h" // This is a variant of CUDA's own __hdiv which is fast than hdiv_promote below // and doesn't suffer from the perf cliff of div.rn.fp32 with 'special' values. __device__ half hdiv_newton(half a, half b) { float fa = __half2float(a); float fb = __half2float(b); float rcp; asm("{rcp.approx.ftz.f32 %0, %1;\n}" : "=f"(rcp) : "f"(fb)); float result = fa * rcp; auto exponent = reinterpret_cast<const unsigned&>(result) & 0x7f800000; if (exponent != 0 && exponent != 0x7f800000) { float err = __fmaf_rn(-fb, result, fa); result = __fmaf_rn(rcp, err, result); } return __float2half(result); } // Surprisingly, this is faster than CUDA's own __hdiv. __device__ half hdiv_promote(half a, half b) { return __float2half(__half2float(a) / __half2float(b)); } // This is an approximation that is accurate up to 1 ulp. __device__ half hdiv_approx(half a, half b) { float fa = __half2float(a); float fb = __half2float(b); float result; asm("{div.approx.ftz.f32 %0, %1, %2;\n}" : "=f"(result) : "f"(fa), "f"(fb)); return __float2half(result); } __global__ void CheckCorrectness() { int i = threadIdx.x + blockIdx.x * blockDim.x; half x = reinterpret_cast<const half&>(i); for (int j = 0; j < 65536; ++j) { half y = reinterpret_cast<const half&>(j); half d1 = hdiv_newton(x, y); half d2 = hdiv_promote(x, y); auto s1 = reinterpret_cast<const short&>(d1); auto s2 = reinterpret_cast<const short&>(d2); if (s1 != s2) { printf("%f (%u) / %f (%u), got %f (%hu), expected: %f (%hu)\n", __half2float(x), i, __half2float(y), j, __half2float(d1), s1, __half2float(d2), s2); //__trap(); } } } __device__ half dst; __global__ void ProfileBuiltin(half x) { #pragma unroll 1 for (int i = 0; i < 10000000; ++i) { x = x / x; } dst = x; } __global__ void ProfilePromote(half x) { #pragma unroll 1 for (int i = 0; i < 10000000; ++i) { x = hdiv_promote(x, x); } dst = x; } __global__ void ProfileNewton(half x) { #pragma unroll 1 for (int i = 0; i < 10000000; ++i) { x = hdiv_newton(x, x); } dst = x; } __global__ void ProfileApprox(half x) { #pragma unroll 1 for (int i = 0; i < 10000000; ++i) { x = hdiv_approx(x, x); } dst = x; } int main() { CheckCorrectness<<<256, 256>>>(); half one = __float2half(1.0f); ProfileBuiltin<<<1, 1>>>(one); // 1.001s ProfilePromote<<<1, 1>>>(one); // 0.560s ProfileNewton<<<1, 1>>>(one); // 0.508s ProfileApprox<<<1, 1>>>(one); // 0.304s auto status = cudaDeviceSynchronize(); printf("%s\n", cudaGetErrorString(status)); } ``` Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D126158	2022-06-04 08:03:29 +02:00
Reid Kleckner	d82b4fe50d	[bazel] Update build for config.h.cmake change	2022-06-03 12:58:04 -07:00
Benjamin Kramer	389c0b81d3	[bazel] Port `95aff23e29`	2022-06-03 20:43:23 +02:00
Benjamin Kramer	d9de52819d	[bazel] Add a missing dependency after `f3bdb56d61`	2022-06-01 22:52:06 +02:00
Stella Laurenzo	3bb7999339	[mlir] Add global_load and global_store ops to ml_program. * Adds simple, non-atomic, non-volatile, non-synchronized direct load/store ops. Differential Revision: https://reviews.llvm.org/D126230	2022-06-01 11:32:15 -07:00
Guillaume Chatelet	ffa479a452	[libc] fix typo in BUILD.bazel feature	2022-06-01 13:53:36 +00:00
Guillaume Chatelet	b2a9ea4420	[libc] Apply no-builtin everywhere, remove unnecessary flags Note, this is a re-submission of D125894 with `features = ["-header_modules"]` added to the main BUILD.bazel file. Some functions like `stpncpy` are implemented in terms of `memset` but are not currently using `-fno-builtin-memset`. This is somewhat hidden by the fact that we use `-ffreestanding` globally and that `-ffreestanding` implies `-fno-builtin` for Clang. This patch also removes `-mllvm -combiner-global-alias-analysis` that is Clang specific and that does not bring substantial gains on modern processors. Also we keep `-mllvm --tail-merge-threshold=0` for aarch64 in CMakeLists.txt but we omit it in the Bazel config. This is because Bazel consumes the source files directly and so it can use PGO to take optimal decisions locally. Differential Revision: https://reviews.llvm.org/D126773	2022-06-01 13:34:36 +00:00
Christian Sigg	f330db8b14	Fix bazel build after `59b273a166`.	2022-06-01 13:13:18 +02:00
Christian Sigg	7cb8b973fa	Fix bazel build after `59b273a166`. Reviewed By: tpopp Differential Revision: https://reviews.llvm.org/D126765	2022-06-01 12:12:04 +02:00
Reid Kleckner	17296607a7	Revert "[Bazel][GN] Reuse the GN LLVM config file generation code" This reverts commit `e2ee8bf981`. This change is beyond my ability to integrate into Google's internal build configuration tonight.	2022-05-31 21:15:46 -07:00
Reid Kleckner	e2ee8bf981	[Bazel][GN] Reuse the GN LLVM config file generation code Currently, the Bazel build uses static, checked in [llvm-]config.h files in combination with global macro definitions to mimic CMake's generated headers. This change reuses the write_cmake_config.py script from the GN build to generate the headers from source in the same way. The purpose is to ensure that the Bazel build stays up to date with any changes to the CMake config files. The write_cmake_config.py script has good error checking to ensure that unneeded, stale variables are not passed, and that any missing variables are reported as errors. I tried to closely follow the logic in the GN build here: llvm/utils/gn/secondary/llvm/include/Config/BUILD.gn The duplication between this file and config.bzl is significant, and we could consider going further, but I'd like to hold off on it for now. The GN build changes are to move the write_cmake_config.py script up to //llvm/utils/write_cmake_config.py, and update the paths accordingly. The next logical change is to generate Clang's config.h header. Differential Revision: https://reviews.llvm.org/D126581	2022-05-31 19:40:05 -07:00
Reid Kleckner	3c31c68c90	[Bazel] Add missing dep after mlgo test change `f46dd19b48`	2022-05-31 19:39:42 -07:00
Fangrui Song	da9d41cb87	[Bazel] Fix typo: startlark=>starlark	2022-05-31 14:12:41 -07:00
Adrian Kuegel	c7bee26f4f	[mlir][Bazel] Adjust BUILD.bazel file	2022-05-31 14:04:07 +02:00
Benjamin Kramer	110a20b70e	[bazel] Port `42c17073fc`	2022-05-31 13:52:05 +02:00
Mikhail Goncharov	d861088024	Fix bazel build After `1c2edb026e`	2022-05-31 10:15:39 +02:00
Alex Zinenko	3f71765a71	[mlir] provide Python bindings for the Transform dialect Python bindings for extensions of the Transform dialect are defined in separate Python source files that can be imported on-demand, i.e., that are not imported with the "main" transform dialect. This requires a minor addition to the ODS-based bindings generator. This approach is consistent with the current model for downstream projects that are expected to bundle MLIR Python bindings: such projects can include their custom extensions into the bundle similarly to how they include their dialects. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D126208	2022-05-30 17:37:52 +02:00
Alex Zinenko	cc6c159203	[mlir] add VectorizeOp to structured transform ops Vectorization is a key transformation to achieve high performance on most architectures. In the transform dialect, vectorization is implemented as a parameterizable transform op. It currently applies to a scope of payload IR delimited by some isolated-from-above op, mainly because several enabling transformations (such as affine simplification) are needed to perform vectorization and these transformation would apply to ops other than the "main" computational payload op. A separate "navigation" transform op that obtains the isolated-from-above ancestor of an op is introduced in the core transform dialect. Even though it is currently only useful for vectorization, isolated-from-above ops are a common anchor for transformations (usually implemented as passes) that is likely to be reused in the future. Depends On D126374 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D126542	2022-05-30 17:37:50 +02:00
NAKAMURA Takumi	de20fb72ad	[bazel] BLAKE3: Adopt aarch64 and x86_64. FIXME: arm(32) may be applicable here. I haven't tested yet. Differential Revision: https://reviews.llvm.org/D126543	2022-05-28 07:05:30 +09:00
Daniele Vettorel	b479ea4b0a	Add llvm-debuginfod-find tool to Bazel build Add missing `llvm-debuginfod-find` tool to the Bazel build. Patch by: vettoreldaniele. Reviewed By: GMNGeoffrey Differential Revision: https://reviews.llvm.org/D126489	2022-05-27 10:22:44 -04:00
wren romano	2046e11ac4	[mlir][sparse] Improving ExecutionEngine/SparseTensorUtils.h This change makes the public API of SparseTensorUtils.cpp explicit, whereas before the publicity of these functions was only implicit. Implicit publicity is sufficient for mlir-opt to generate calls to these functions, but it's not enough to enable C/C++ code to call them directly in the usual way (i.e., without going through codegen). Thus, leaving the publicity implicit prevents development of other tools (e.g., microbenchmarks). In addition this change also marks the functions MLIR_CRUNNERUTILS_EXPORT, which is required by the JIT under certain configurations (albeit not for anything in our test suite). Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D126105	2022-05-26 17:22:08 -07:00
NAKAMURA Takumi	6f434776da	[bazel] Introduce "VE" CodeGen in LLVM.	2022-05-26 22:39:49 +09:00
Alex Zinenko	73c3dff1b3	[mlir] Use-after-free checker for the Transform dialect The Transform dialect uses the side effect modeling mechanism to record the effects of the transform ops on the mapping between Transform IR values and Payload IR ops. Introduce a checker pass that warns if a Transform IR value is used after it has been freed (consumed). This pass is mostly intended as a debugging aid in addition to the verification/assertion mechanisms in the transform interpreter. It reports all potential use-after-free situations. The implementation makes a series of simplifying assumptions to be simple and conservative. A more advanced implementation would rely on the data flow-like analysis associated with a side-effect resource rather than a value, which is currently not supported by the analysis infrastructure. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D126381	2022-05-26 12:28:41 +02:00
NAKAMURA Takumi	65ab6b495a	[bazel] Unset REVISION as if LLVM_APPEND_VC_REV=OFF, for now. We could implement retrieving the revision here, but we may avoid "Just the same but only different revision hash string".	2022-05-26 06:24:49 +09:00
NAKAMURA Takumi	801ac2ebf1	[bazel] Bump to 15.0.0git	2022-05-26 06:24:49 +09:00
Matthias Springer	210c4e7fc8	[mlir][bufferization] Fix Python bindings Differential Revision: https://reviews.llvm.org/D126179	2022-05-23 18:12:56 +02:00
Matthias Springer	ffdbecccaf	[mlir][bufferization] Add bufferization.alloc_tensor op This change adds a new op `alloc_tensor` to the bufferization dialect. During bufferization, this op is always lowered to a buffer allocation (unless it is "eliminated" by a pre-processing pass). It is useful to have such an op in tensor land, because it allows users to model tensor SSA use-def chains (which drive bufferization decisions) and because tensor SSA use-def chains can be analyzed by One-Shot Bufferize, while memref values cannot. This change also replaces all uses of linalg.init_tensor in bufferization-related code with bufferization.alloc_tensor. linalg.init_tensor and bufferization.alloc_tensor are similar, but the purpose of the former one is just to carry a shape. It does not indicate a memory allocation. linalg.init_tensor is not suitable for modelling SSA use-def chains for bufferization purposes, because linalg.init_tensor is marked as not having side effects (in contrast to alloc_tensor). As such, it is legal to move linalg.init_tensor ops around/CSE them/etc. This is not desirable for alloc_tensor; it represents an explicit buffer allocation while still in tensor land and such allocations should not suddenly disappear or get moved around when running the canonicalizer/CSE/etc. BEGIN_PUBLIC No public commit message needed for presubmit. END_PUBLIC Differential Revision: https://reviews.llvm.org/D126003	2022-05-21 02:47:32 +02:00
Dmitri Gribenko	30628b0ecc	Use the public clang::Builtin API in the unit test	2022-05-20 18:09:09 +02:00
Dmitri Gribenko	cf31db35a7	Adjust BUILD files for [ifs] Switch to using OptTable	2022-05-20 18:08:29 +02:00
Dmitri Gribenko	73bd60b843	Adjust BUILD files for [MLIR][GPU] Add NvGpu mma.sync path to the VectorToGPU pass	2022-05-20 18:03:20 +02:00
Aart Bik	28b6d412af	[mlir][sparse] add support for complex zero/one building Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D126039	2022-05-20 08:53:30 -07:00
Guillaume Chatelet	0443bfabe7	Revert "[libc] Apply no-builtin everywhere, remove unnecessary flags" This reverts commit `94d6dd9057`.	2022-05-20 14:37:17 +00:00
Alex Brachet	c3856cb739	[bazel][libc] Fix bazel build Differential revision: https://reviews.llvm.org/D126028	2022-05-19 22:58:50 +00:00
Jorge Gorbe Moya	221b7a4583	[bazel] Add lib/Basic/BuiltinTargetFeatures.h to clang:basic `hdrs`. This header is included by clang/unittests/CodeGen/CheckTargetFeaturesTest.cpp so it needs to be exposed here to make it visible.	2022-05-19 14:20:17 -07:00
Guillaume Chatelet	94d6dd9057	[libc] Apply no-builtin everywhere, remove unnecessary flags Some functions like `stpncpy` are implemented in terms of `memset` but are not currently using `-fno-builtin-memset`. This is somewhat hidden by the fact that we use `-ffreestanding` globally and that `-ffreestanding` implies `-fno-builtin` for Clang. This patch also removes `-mllvm -combiner-global-alias-analysis` that is Clang specific and that does not bring substantial gains on modern processors. Also we keep `-mllvm --tail-merge-threshold=0` for aarch64 in CMakeLists.txt but we omit it in the Bazel config. This is because Bazel consumes the source files directly and so it can use PGO to take optimal decisions locally. Differential Revision: https://reviews.llvm.org/D125894	2022-05-19 09:08:42 +00:00
Stella Laurenzo	8b7e85f4f8	[mlir][python] Add Python bindings for ml_program dialect. Differential Revision: https://reviews.llvm.org/D125852	2022-05-18 23:08:33 -07:00
Stella Laurenzo	2bb252852c	[mlir] Add GlobalOp, GlobalLoadConstOp to ml_program. The approach I took was to define a dialect 'extern' attribute that a GlobalOp can take as a value to signify external linkage. I think this approach should compose well and should also work with wherever the OpaqueElements work goes in the future (since that is just another kind of attribute). I special cased the GlobalOp parser/printer for this case because it is significantly easier on the eyes. In the discussion, Jeff Niu had proposed an alternative syntax for GlobalOp that I ended up not taking. I did try to implement it but a) I don't think it made anything easier to read in the common case, and b) it made the parsing/printing logic a lot more complicated (I think I would need a completely custom parser/printer to do it well). Please have a look at the common cases where the global type and initial value type match: I don't think how I have it is too bad. The less common cases seem ok to me. I chose to only implement the direct, constant load op since that is non side effecting and there was still discussion pending on that. Differential Revision: https://reviews.llvm.org/D124318	2022-05-18 23:08:28 -07:00
Benjamin Kramer	e497871356	[mlir][complex] Add pow/sqrt/tanh ops and lowering to libm Lowering through libm gives us a baseline version, even though it's not going to be particularly fast. This is similar to what we do for some math dialect ops. Differential Revision: https://reviews.llvm.org/D125550	2022-05-18 14:03:14 +02:00
Aart Bik	736c1b66ef	[mlir][sparse] introduce complex type to sparse tensor support This is the first implementation of complex (f64 and f32) support in the sparse compiler, with complex add/mul as first operations. Note that various features are still TBD, such as other ops, and reading in complex values from file. Also, note that the std::complex<float> had a bit of an ABI issue when passed as single argument. It is still TBD if better solutions are possible. Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D125596	2022-05-16 13:17:36 -07:00
Benjamin Kramer	cc88212d81	[bazel] Port `ae8bbc43f4`	2022-05-14 12:12:20 +02:00
Tres Popp	1dce51b888	[mlir] Add TensorToLinalgPass This pass is to handle computationally complex operations like tensor.pad which are not simply lowered to the exact same operation in the memref dialect. Differential Revision: https://reviews.llvm.org/D125384	2022-05-13 12:17:22 +02:00
Michael Jones	dd7f30464b	[libc] fix uint includes and libc bazel This patch fixes the includes for the new UInt class so that the api test now passes, additionally it fixes the bazel files to account for the new dependencies. Differential Revision: https://reviews.llvm.org/D125490	2022-05-12 11:40:52 -07:00
Thomas Raoux	d02f10d96d	[mlir][vector] Add lowering pattern for vector.warp_execute_on_lane_0 op Add lowering of the vector.warp_execute_on_lane_0 into scf.if plus memory transfer for the operands and yield values. This also add an integration test running on GPU warp. The same tests can be later re-used with different comment lines to tests distribution transformations. This is mostly from @springerm contribution. Differential Revision: https://reviews.llvm.org/D125430	2022-05-12 13:27:43 +00:00
Benjamin Kramer	303638248a	[mlir][linalg] Add lowering of named ops on complex numbers This lets linalg.dot and friends lower to a complex muladd using ops from the complex dialect. Differential Revision: https://reviews.llvm.org/D125461	2022-05-12 13:37:34 +02:00
Benjamin Kramer	ca6cbbe8d0	[bazel] Add support for configuring the bazel build for PPC TF already carries a patch for this.	2022-05-12 12:04:14 +02:00
Vibhuti Sawant	6b6e796b74	[Bazel] Add support for s390x build target While executing the test suite for Tensorflow(v2.8.0), we encountered multiple TC failures with the below error ``` 'z14' is not a recognized processor for this target ``` This patch adds the s390x target to the build target list. It fixes TC failures in multiple modules of Tensorflow on s390x arch. It is also tested to have no effect on x86 machines. Reviewed By: GMNGeoffrey Differential Revision: https://reviews.llvm.org/D125096	2022-05-11 09:23:16 -07:00
Thomas Raoux	15bcc36eed	[mlir][gpu] Move async copy ops to NVGPU and add caching hints Move async copy operations to NVGPU as they only exist on NV target and are designed to match ptx semantic. This allows us to also add more fine grain caching hint attribute to the op. Add hint to bypass L1 and hook it up to NVVM op. Differential Revision: https://reviews.llvm.org/D125244	2022-05-10 22:30:24 +00:00
Krzysztof Drewniak	814b605095	[mlir][AMDGPU] Add AMDGPU conversion patterns to ConvertGPUToROCDL This ensures that attributes such as the index bitwidth propagate correctly to the AMDGPUToROCDL patterns. Differential Revision: https://reviews.llvm.org/D125320	2022-05-10 16:49:11 +00:00
Krzysztof Drewniak	f1f05a91ca	[MLIR][AMDGPU] Add AMDGPU dialect, wrappers around raw buffer intrinsics By analogy with the NVGPU dialect, introduce an AMDGPU dialect for AMD-specific intrinsic wrappers. The dialect initially includes wrappers around the raw buffer intrinsics. On AMD GPUs, a memref can be converted to a "buffer descriptor" that allows more precise control of memory access, such as by allowing for out of bounds loads/stores to be replaced by 0/ignored without adding additional conditional logic, which is important for performance. The repository currently contains a limited conversion from transfer_read/transfer_write to Mubuf intrinsics, which are an older, deprecated intrinsic for the same functionality. The new amdgpu.raw_buffer_* ops allow these operations to be used explicitly and for including metadata such as whether the target chipset is an RDNA chip or not (which impacts the interpretation of some bits in the buffer descriptor), while still maintaining an MLIR-like interface. (This change also exposes the floating-point atomic add intrinsic.) Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D122765	2022-05-10 14:59:58 +00:00
Benjamin Kramer	9eccc7357e	[bazel] Fix the build after `2c33266084`	2022-05-03 23:04:10 +02:00
Eric Li	aaddfbf9d6	[bazel] Add test targets for dataflow framework Differential Revision: https://reviews.llvm.org/D124819	2022-05-03 14:09:10 +00:00
Matthias Springer	3c2a74a3ae	[mlir][linalg][transform] Add TileOp to transform dialect This commit adds a tiling op to the transform dialect as an external op. Differential Revision: https://reviews.llvm.org/D124661	2022-04-29 21:35:31 +09:00
Benjamin Kramer	b8d4fe0f0a	[bazel] Port `92a836da07`	2022-04-28 22:51:27 +02:00
Benjamin Kramer	1fbdf3a02e	[bazel] Port `84fe39a45b`	2022-04-28 18:29:43 +02:00
Stephan Herhut	c10bbc20bc	[mlir][bazel] Add suport for PDLL tests. Differential Revision: https://reviews.llvm.org/D124515	2022-04-27 12:35:12 +02:00
Adrian Kuegel	c2a8490193	[mlir][Bazel] Add missing dependencies. When building with layering_check enabled, there needs to be a dependency for each header include.	2022-04-25 09:24:07 +02:00
Fangrui Song	bbbc49f780	[Bazel] Add more mlir dependencies after D124298 The Bazel layering_check feature compiles libraries with `-fmodule-name=X -fmodules-strict-decluse` which require #include to be in deps.	2022-04-23 13:06:54 -07:00
Fangrui Song	6c8612fe6f	[Bazel] Make mlir:BufferizationDialect depend on mlir:ArithmeticDialect after D124298	2022-04-23 12:59:37 -07:00
Alex Zinenko	40a8bd635b	[mlir] use side effects in the Transform dialect Currently, the sequence of Transform dialect operations only supports a single use of each operand (verified by the `transform.sequence` operation). This was originally motivated by the need to guard against accessing a payload IR operation associated with a transform IR value after this operation has likely been rewritten by a transformation. However, not all Transform dialect operations rewrite payload IR, in particular the "navigation" operation such as `transform.pdl_match` do not. Introduce memory effects to the Transform dialect operations to describe their effect on the payload IR and the mapping between payload IR opreations and transform IR values. Use these effects to replace the single-use rule, allowing repeated reads and disallowing use-after-free, where operations with the "free" effect are considered to "consume" the transform IR value and rewrite the corresponding payload IR operations). As an additional improvement, this enables code motion transformation on the transform IR itself. Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D124181	2022-04-22 23:29:11 +02:00
Fangrui Song	baebe12ad0	[Bazel] Make mlir/test:TestShapeDialect depend on mlir:FuncDialect	2022-04-22 13:55:27 -07:00
Jacques Pienaar	9bae20b528	[mlir] Add shape.func Add shape func op for use (primarily) in shape function_library op. Allows setting default dialect for some simpler authoring. This is a minimal version of the ops needed. Differential Revision: https://reviews.llvm.org/D124055	2022-04-22 11:35:35 -07:00
Matthias Springer	e07a7fd5c0	[mlir][bufferization] Move ModuleBufferization to bufferization dialect * Move Module Bufferization to the bufferization dialect. The implementation is split into `OneShotModuleBufferize.cpp` and `FuncBufferizableOpInterfaceImpl.cpp`, so that the external model implementation can be easily moved to the func dialect in the future. * Split and clean up test cases. A few test cases are still remaining in Linalg and will be updated separately. * `linalg.inplaceable` is renamed to `bufferization.writable` to accurately reflect its current usage. * Attributes and their verifiers are moved from the Linalg dialect to the Bufferization dialect. * Expand documentation. * Add a new flag to One-Shot Bufferize to allow for function boundary bufferization. Differential Revision: https://reviews.llvm.org/D122229	2022-04-22 19:37:28 +09:00
Sam McCall	0d43614df1	[bazel] try to adapt `a7691dee2d`, again	2022-04-21 22:33:40 +02:00
Sam McCall	f595b51f50	[bazel] try to adapt `a7691dee2d`	2022-04-21 22:09:12 +02:00
Alex Zinenko	30f22429d3	[mlir] Connect Transform dialect to PDL This introduces a pair of ops to the Transform dialect that connect it to PDL patterns. Transform dialect relies on PDL for matching the Payload IR ops that are about to be transformed. For this purpose, it provides a container op for patterns, a "pdl_match" op and transform interface implementations that call into the pattern matching infrastructure. To enable the caching of compiled patterns, this also provides the extension mechanism for TransformState. Extensions allow one to store additional information in the TransformState and thus communicate it between different Transform dialect operations when they are applied. They can be added and removed when applying transform ops. An extension containing a symbol table in which the pattern names are resolved and a pattern compilation cache is introduced as the first client. Depends On D123664 Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D124007	2022-04-21 16:23:10 +02:00
Chuanqi Xu	483efc9ad0	[Pipelines] Remove Legacy Passes in Coroutines The legacy passes are deprecated now and would be removed in near future. This patch tries to remove legacy passes in coroutines. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D123918	2022-04-21 10:59:11 +08:00
Benjamin Kramer	ff087d705c	[bazel] Port `f26c41e8dd`	2022-04-20 16:15:16 +02:00
Alex Zinenko	0eb403ad1b	[mlir][transform] Introduce transform.sequence op Sequence is an important transform combination primitive that just indicates transform ops being applied in a row. The simplest version requires fails immediately if any transformation in the sequence fails. Introducing this operation allows one to start placing transform IR within other IR. Depends On D123135 Reviewed By: Mogball, rriddle Differential Revision: https://reviews.llvm.org/D123664	2022-04-19 21:41:02 +02:00
Mogball	fa26c7ff4b	[mlir] Refactor LICM into a utility LICM is refactored into a utility that is application on any region. The implementation is moved to Transform/Utils.	2022-04-16 00:37:07 +00:00
Stella Stamenova	353f0a8e43	Revert "[mlir] Refactor LICM into a utility" This reverts commit `3131f80824`. This commit broke the Windows mlir bot: https://lab.llvm.org/buildbot/#/builders/13/builds/19745	2022-04-15 17:09:16 -07:00
Mogball	3131f80824	[mlir] Refactor LICM into a utility LICM is refactored into a utility that is application on any region. The implementation is moved to Transform/Utils.	2022-04-15 22:07:01 +00:00
rdzhabarov	3ef4099a61	[mlir] Fix BUILD issues and dependencies. Differential Revision: https://reviews.llvm.org/D123868	2022-04-15 19:05:02 +00:00
Dmitri Gribenko	61bd985f2a	Adjust Bazel BUILD files for `6d45558c1`	2022-04-15 15:45:18 +02:00
Thomas Raoux	fa2762a251	[mlir] Update bazel file after adding nvgpu to nvvm conversion	2022-04-15 04:29:32 +00:00
Thomas Raoux	59058c441a	[mlir][vector] Add operations used for Vector distribution Add vector op warp_execute_on_lane_0 that will be used to do incremental vector distribution in order to target warp level vector programming for architectures with GPU-like SIMT programming model. The idea behing the op is discussed further on discourse: https://discourse.llvm.org/t/vector-vector-distribution-large-vector-to-small-vector/1983/23 Differential Revision: https://reviews.llvm.org/D123703	2022-04-15 03:47:52 +00:00
Thomas Raoux	4c564940a1	[mlir][nvgpu] Add NVGPU dialect (architectural specific gpu dialect) This introduce a new dialect for vendro specific ptx operations. This also adds the first operation ldmatrix as an example. More operations will be added in follow up patches. This new dialect is meant to be a bridge between GPU and Vector dialectis and NVVM dialect. This is based on the RFC proposed here: https://discourse.llvm.org/t/rfc-add-nv-gpu-dialect-hw-specific-extension-of-gpu-dialect-for-nvidia-gpus/61466/8 Differential Revision: https://reviews.llvm.org/D123266	2022-04-14 16:33:46 +00:00
Alex Zinenko	d064c4801c	[mlir] Introduce Transform dialect This dialect provides operations that can be used to control transformation of the IR using a different portion of the IR. It refers to the IR being transformed as payload IR, and to the IR guiding the transformation as transform IR. The main use case for this dialect is orchestrating fine-grain transformations on individual operations or sets thereof. For example, it may involve finding loop-like operations with specific properties (e.g., large size) in the payload IR, applying loop tiling to those and only those operations, and then applying loop unrolling to the inner loops produced by the previous transformations. As such, it is not intended as a replacement for the pass infrastructure, nor for the pattern rewriting infrastructure. In the most common case, the transform IR will be processed and applied to payload IR by a pass. Transformations expressed by the transform dialect may be implemented using the pattern infrastructure or any other relevant MLIR component. This dialect is designed to be extensible, that is, clients of this dialect are allowed to inject additional operations into this dialect using the newly introduced in this patch `TransformDialectExtension` mechanism. This allows the dialect to avoid a dependency on the implementation of the transformation as well as to avoid introducing dialect-specific transform dialects. See https://discourse.llvm.org/t/rfc-interfaces-and-dialects-for-precise-ir-transformation-control/60927. Reviewed By: nicolasvasilache, Mogball, rriddle Differential Revision: https://reviews.llvm.org/D123135	2022-04-14 13:48:45 +02:00
Alex Zinenko	09141f1adf	[mlir] Split intrinsics out of LLVMOps.td Move the operations that correspond to LLVM IR intrinsics in a separate .td file. This makes it easier to maintain the intrinsics and decreases the compile time of LLVMDialect.cpp by ~25%. Depends On D123310 Reviewed By: wsmoses, jacquesguan Differential Revision: https://reviews.llvm.org/D123315	2022-04-14 13:23:31 +02:00
Stella Laurenzo	61352a580a	[mlir] Introduce ml_program dialect. Differential Revision: https://reviews.llvm.org/D120203	2022-04-13 21:38:14 -07:00
Arthur Eubanks	08bd7d557c	[bazel] Set CLANG_ENABLE_OPAQUE_POINTERS_INTERNAL to 1 Matches official cmake build.	2022-04-13 16:31:52 -07:00
Dmitri Gribenko	e67b90bdb3	Update the Bazel build files for "[mlir][Math] Replace some constant ..."	2022-04-12 13:47:51 +02:00
rdzhabarov	64d3e163d8	Fix BUILD dependency for ExecutionEngineUtils Differential Revision: https://reviews.llvm.org/D123570	2022-04-12 04:49:16 +00:00
Eugene Zhulenev	b35b9e307f	[mlir] Add msan memory unpoisoning macros to mlir ExecutionEngine Adding annotations on as-needed bases, currently only for memrefCopy, but in general all C API functions that take pointers to memory allocated/initialized inside the jit-compiled code must be annotated, to be able to run with msan. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D123557	2022-04-11 18:58:28 -07:00
rdzhabarov	e4c4d0d298	Fixing BUILD dependency on the DialectBase. Differential Revision: https://reviews.llvm.org/D123558	2022-04-12 01:22:15 +00:00
Fangrui Song	a8ef1647aa	[CMake][gn][Bazel] Remove HAVE_PTHREAD_GETSPECIFIC The only user was removed by `d351f54a07`.	2022-04-11 14:44:45 -07:00
Fangrui Song	ca68038d12	Reland "[Driver] Default CLANG_DEFAULT_PIE_ON_LINUX to ON"" (With C++ exceptions, `clang++ --target=mips64{,el}-linux-gnu -fpie -pie -fuse-ld=lld` has link errors (lld does not implement some strange R_MIPS_64 .eh_frame handling in GNU ld). However, sanitizer-x86_64-linux-qemu used this to build ScudoUnitTests. Pined ScudoUnitTests to -no-pie.) Default the option introduced in D113372 to ON to match all(?) major Linux distros. This matches GCC and improves consistency with Android and linux-musl which always default to PIE. Note: CLANG_DEFAULT_PIE_ON_LINUX may be removed in the future. Differential Revision: https://reviews.llvm.org/D120305	2022-04-08 23:40:18 -07:00
Jorge Gorbe Moya	ac1235dda6	Fix bazel rule for __support_fputil_fma when using header modules. Putting __support/FPUtil/x86_64/FMA.h in `hdrs` will trigger a compilation action for that header, and it will always `#error` out for non-FMA targets. Move these platform-specific headers that are conditionally included to `textual_hdrs` instead.	2022-04-08 16:28:31 -07:00
Tue Ly	c5f8a0a1e9	[libc] Add support for x86-64 targets that do not have FMA instructions. Make FMA flag checks more accurate for x86-64 targets, and refactor polyeval to use multiply and add instead when FMA instructions are not available. Reviewed By: michaelrj, sivachandra Differential Revision: https://reviews.llvm.org/D123335	2022-04-08 14:12:24 -04:00
Nico Weber	e22a60b1c8	Revert "Reland "[Driver] Default CLANG_DEFAULT_PIE_ON_LINUX to ON""" This reverts commit `2aca33baf1`. Broke tests on several bots, see comments on https://reviews.llvm.org/D120305	2022-04-07 10:07:07 -04:00
Benjamin Kramer	6b15157610	[bazel] Port `3031fa88f0`	2022-04-07 12:14:14 +02:00
Benjamin Kramer	b8acaaa0c7	[bazel] Port `5390606aa9`	2022-04-07 10:40:55 +02:00
Fangrui Song	2aca33baf1	Reland "[Driver] Default CLANG_DEFAULT_PIE_ON_LINUX to ON"" (The upgrade of the ppc64le bot and D121257 have fixed compiler-rt failures. Tested by nemanjai.) Default the option introduced in D113372 to ON to match all(?) major Linux distros. This matches GCC and improves consistency with Android and linux-musl which always default to PIE. Note: CLANG_DEFAULT_PIE_ON_LINUX may be removed in the future. Differential Revision: https://reviews.llvm.org/D120305	2022-04-06 20:19:07 -07:00
Nikita Popov	ed4e6e0398	[cmake] Remove LLVM_ENABLE_NEW_PASS_MANAGER cmake option Or rather, error out if it is set to something other than ON. This removes the ability to enable the legacy pass manager by default, but does not remove the ability to explicitly enable it through various flags like -flegacy-pass-manager or -enable-new-pm=0. I checked, and our test suite definitely doesn't pass with LLVM_ENABLE_NEW_PASS_MANAGER=OFF anymore. Differential Revision: https://reviews.llvm.org/D123126	2022-04-06 09:52:21 +02:00
Jorge Gorbe Moya	54cc7de4bc	Fix bazel build. - https://reviews.llvm.org/D122619 bumped zlib version but didn't change the hash - Added new header from https://reviews.llvm.org/D108438	2022-04-05 15:45:53 -07:00
Benjamin Kramer	fe11344257	[bazel] Try to fix the build after `4661a65f4b`	2022-04-05 14:40:52 +02:00
Alexander Belyaev	004d4f8980	[mlir] Update BUILD.bazel.	2022-04-05 11:25:40 +02:00
Alexander Belyaev	cc6788aece	[mlir] Update mlir/BUILD.bazel.	2022-04-05 10:14:41 +02:00
Rob Suderman	126e7eaf0d	[tosa] Add option to disable tosa.apply_scale lowering in TosaToStandard Apply scale should be optionally disabled when lowering via TosaToStandard. In most cases it should persist until the lowering to specific backend. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D122948	2022-04-04 12:22:12 -07:00

1 2 3 4 5 ...

503 Commits