llvm-project

Commit Graph

Author	SHA1	Message	Date
Stanley Winata	026fac2a14	[mlir][linalg] Vectorization for conv_1d_ncw_fcw Most computer vision torch models use nchw/ncw convolution. In a previous patch we added decomposition conv2dNchw to conv1dNcw. To enhance the performance on torch models we add this vectorization pattern for conv1dNcw which would consequently also improve the performance on conv2dNchw. On IREE + Intel Xeon 8360 + Resnet50, we were able to get a ~7x speedup, from ~880ms to 126ms. Reviewed By: nicolasvasilache, hanchung Differential Revision: https://reviews.llvm.org/D133675	2022-09-14 11:07:53 -07:00
Piotr Sobczak	abd927e5a8	[AMDGPU] Check for num elts in SelectVOP3PMods The rest of the code section assumes there are exactly two elements in the vector (Lo, Hi), so add the check before entering the section. Differential Revision: https://reviews.llvm.org/D133852	2022-09-14 20:00:19 +02:00
Julius	49e7ef2c09	[Clang]: Diagnose deprecated copy operations also in MSVC compatibility mode When running in MSVC compatibility mode, previously no deprecated copy operation warnings (enabled by -Wdeprecated-copy) were raised. This restriction was already in place when the deprecated copy warning was first introduced. This patch removes said restriction so that deprecated copy warnings, if enabled, are also raised in MSVC compatibility mode. The reasoning here being that these warnings are still useful when running in MSVC compatibility mode and also have to be semi-explicitly enabled in the first place (using -Wdeprecated-copy, -Wdeprecated or -Wextra). Differential Revision: https://reviews.llvm.org/D133354	2022-09-14 19:48:08 +02:00
Florian Hahn	7f3ff9d3c0	[ConstraintElimination] Track if variables are positive in constraint. Keep track if variables are known positive during constraint decomposition, aggregate the information when building the constraint object and encode the extra information as constraints to be used during reasoning.	2022-09-14 18:43:54 +01:00
Thomas Raoux	4abb9e5d20	[mlir][vector] Clean up and generalize lowering of warp_execute to scf Simplify the lowering of warp_execute_on_lane0 of scf.if by making the logic more generic. Also remove the assumption that the most inner dimension is the dimension distributed. Differential Revision: https://reviews.llvm.org/D133826	2022-09-14 17:36:16 +00:00
Matt Arsenault	c9ef7d49ab	llvm-reduce: Do not insert replacement IMPLICIT_DEFs for dead defs Also skip dead defs when looking for a previous vreg with the same class. This helps avoid some mid-reduction verifier errors when LiveIntervals computation starts introducing dead flags everywhere.	2022-09-14 13:21:14 -04:00
Matt Arsenault	0e1ee738f1	llvm-reduce: Restrict test to only test relevant reductions Avoids breaking this test in a future change.	2022-09-14 13:21:01 -04:00
Joseph Huber	23bc343855	[Libomptarget] Change device free routines to accept the allocation kind Previous support for device memory allocators used a single free routine and did not provide the original kind of the allocation. This is problematic as some of these memory types required different handling. Previously this was worked around using a map in runtime to record the original kind of each pointer. Instead, this patch introduces new free routines similar to the existing allocation routines. This allows us to avoid a map traversal every time we free a device pointer. The only interfaces defined by the standard are `omp_target_alloc` and `omp_target_free`, these do not take a kind as `omp_alloc` does. The standard dictates the following: "The omp_target_alloc routine returns a device pointer that references the device address of a storage location of size bytes. The storage location is dynamically allocated in the device data environment of the device specified by device_num." Which suggests that these routines only allocate the default device memory for the kind. So this has been changed to reflect this. This change is somewhat breaking if users were using `omp_target_free` as previously shown in the tests. Reviewed By: JonChesterfield, tianshilei1992 Differential Revision: https://reviews.llvm.org/D133053	2022-09-14 12:14:07 -05:00
Nico Weber	5631d20bfc	Revert "[clang] fix generation of .debug_aranges with LTO" This reverts commit `6bf6730ac5`. Breaks tests if LLD isn't being built, see comments on https://reviews.llvm.org/D133092	2022-09-14 12:43:24 -04:00
revunov.denis@huawei.com	553c238952	[BOLT] Preserve original LSDA type encoding In non-pie binaries BOLT unconditionally converted type encoding from indirect to absptr, which broke std exceptions since pointers to their typeinfo were only assigned at runtime in .data section. In this patch we preserve original encoding so that indirect remains indirect and can be resolved at runtime, and absolute remains absolute. Reviewed By: rafauler, maksfb Differential Revision: https://reviews.llvm.org/D132484	2022-09-14 16:33:47 +00:00
Ashay Rane	f1848b0a0e	[clang] fix linker executable path in test A previous patch (https://reviews.llvm.org/D132810) introduced a test that fails on systems where the linker executable (`ld`) has a `.exe` extension. This patch updates the regex in the test so that lit can look for both `ld` as well as `ld.exe`. Reviewed By: stella.stamenova Differential Revision: https://reviews.llvm.org/D133773	2022-09-14 11:35:37 -05:00
Stella Stamenova	da459043f8	Revert "[lldb][DWARF5] Enable macro evaluation" This reverts commit `a0fb69d17b`. This broke the windows lldb bot: https://lab.llvm.org/buildbot/#/builders/83/builds/23666	2022-09-14 09:30:49 -07:00
Nico Weber	db6a53450f	Revert "[test][clang] run test for lld emitting dwarf-aranages only if lld is presented" This reverts commit `44075cc34a`. Broke check-clang, see comments on https://reviews.llvm.org/D133841	2022-09-14 12:17:41 -04:00
Eman Copty	54bd8bb452	[mlir] Add accessor methods for I[2\|4\|16] types to Builder. Adds the accessor methods for I[2\|4\|16] types to the Builder. Differential Revision: https://reviews.llvm.org/D133793	2022-09-14 09:06:00 -07:00
Peiming Liu	55a1d50fb9	[mlir][sparse] Make sparse compiler more admissible. Previously, the iteration graph was computed without priority. This patch adds a heuristic when computing the iteration graph by starting with Reduction iterator when doing topo sort, which makes Reduction iterators (likely) appear as late in the sorted array as possible. The current sparse compiler also failed to compile the newly added case. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D133738	2022-09-14 15:59:47 +00:00
Nicolas Vasilache	e479aecd56	Revert "[mlir][scf][Transform] Refactor transform.fuse_into_containing_op so it is iterative and supports output fusion." This reverts commit `54a5f60628` which is a WIP that was pushed by mistake.	2022-09-14 08:51:30 -07:00
Nicolas Vasilache	54a5f60628	[mlir][scf][Transform] Refactor transform.fuse_into_containing_op so it is iterative and supports output fusion. This revision revisits the implementation of `transform.fuse_into_containing_op` so that it iterates on producers one use at a time. Support is added to fuse a producer through a foreach_thread shared tensor argument, in which case we tile and fuse the op inside the containing op and update the shared tensor argument to the unique destination operand. If one cannot find such a unique destination operand the transform fails.	2022-09-14 08:50:32 -07:00
Nicolas Vasilache	593c14d422	[mlir][Linalg] Add return type filter to the transform dialect This allows matching ops by additionally providing an idiomatic spec for a unique return type. Differential Revision: https://reviews.llvm.org/D133862	2022-09-14 08:50:31 -07:00
Alexey Bataev	d647312e3f	[SLP][NFC]Extract getLastInstructionInBundle function for better dependence checking, NFC. Part of D110978	2022-09-14 08:43:15 -07:00
Jeff Niu	3108249dea	[MLIR][math] Use approximate matches for folded ops LibM implementations differ, so the folders can have different results on different platforms. For instance, the `cos` folder was failing on M1 mac. I chose to match the constant floats to 2(.5) significant digits. Reviewed By: jacquesguan Differential Revision: https://reviews.llvm.org/D133797	2022-09-14 08:39:41 -07:00
Groverkss	b696e25a7a	[MLIR][Presburger] Add hermite normal form computation to Matrix This patch adds hermite normal form computation to Matrix. Part of this algorithm lived in LinearTransform, being used for computing column echelon form. This patch moves the implementation to Matrix::hermiteNormalForm and generalises it to compute the hermite normal form. Reviewed By: arjunp Differential Revision: https://reviews.llvm.org/D133510	2022-09-14 16:39:05 +01:00
Mats Petersson	9067de2a43	[flang][driver]Fix broken PowerPC tests Tests don't work on PPC since the `return` instruction isn't called `ret` (apparently) Reviewed By: awarzynski Differential Revision: https://reviews.llvm.org/D133859	2022-09-14 16:33:38 +01:00
Zain Jaffal	8253f7e286	[InstCombine] Optimize multiplication where both operands are negated Handle the case where both operands are negated in matrix multiplication Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D133695	2022-09-14 16:29:39 +01:00
Haojian Wu	f6e759bd26	Remove some unused static functions in CGOpenMPRuntimeGPU.cpp, NFC	2022-09-14 17:20:02 +02:00
David Spickett	3acaf04033	[LLVM][AArch64] Don't warn about clobbering X16 when Speculative Load Hardening is used SLH will fall back to a different technique if X16 is being used, so there is no need to warn for inline asm use. Only prevent other codegen from using it. Reviewed By: kristof.beyls Differential Revision: https://reviews.llvm.org/D133766	2022-09-14 15:19:53 +00:00
Joseph Huber	bae1a2cf3c	[OpenMP] Remove unused function after removing simplified interface Summary: A previous patch removed the user of this function but did not remove the function causing unused function warnings. Remove it.	2022-09-14 10:14:43 -05:00
John Ericson	154db06ce0	[CMake] Avoid `LLVM_BINARY_DIR` when other more specific variable are better-suited, part 1 A simple sed doing these substitutions: - `${LLVM_BINARY_DIR}/\$\{CMAKE_CFG_INTDIR}/lib(${LLVM_LIBDIR_SUFFIX})?\>` -> `${LLVM_LIBRARY_DIR}` - `${LLVM_BINARY_DIR}/\$\{CMAKE_CFG_INTDIR}/bin\>` -> `${LLVM_TOOLS_BINARY_DIR}` where `\>` means "word boundary". The only manual modifications were reverting changes in - `compiler-rt/cmake/Modules/CompilerRTUtils.cmake` because these were "entry points" where we wanted to tread carefully not to introduce a "loop" which would end with an undefined variable being expanded to nothing. There are many more occurrences without `CMAKE_CFG_INTDIR`, but those are left for D132316 as they have proved somewhat tricky to fix. This hopefully increases readability overall, and also decreases the usages of `LLVM_LIBDIR_SUFFIX`, preparing us for D130586. Reviewed By: sebastian-ne Differential Revision: https://reviews.llvm.org/D133828	2022-09-14 10:58:47 -04:00
Arjun P	6d6f6c4d3f	[MLIR][Presburger] use arbitrary-precision arithmetic with MPInt instead of int64_t Only the main Presburger library under the Presburger directory has been switched to use arbitrary precision. Users have been changed to just cast returned values back to int64_t or to use newly added convenience functions that perform the same cast internally. The performance impact of this has been tested by checking test runtimes after copy-pasting 100 copies of each function. Affine/simplify-structures.mlir goes from 0.76s to 0.80s after this patch. Its performance sees no regression compared to its original performance at commit `18a06d4f3a` before a series of patches that I landed to offset the performance overhead of switching to arbitrary precision. Affine/canonicalize.mlir and SCF/canonicalize.mlir show no noticeable difference, staying at 2.02s and about 2.35s respectively. Also, for Affine and SCF tests as a whole (no copy-pasting), the runtime remains about 0.09s on average before and after. Reviewed By: bondhugula Differential Revision: https://reviews.llvm.org/D129510	2022-09-14 15:47:41 +01:00
Balazs Benics	b8e1da0506	[analyzer] Initialize ShouldEmitErrorsOnInvalidConfigValue analyzer option Downstream users who don't make use of the clang cc1 frontend for commandline argument parsing won't benefit from the Marshalling provided default initialization of the AnalyzerOptions entries. More about this later. Those analyzer option fields, as they are bitfields, cannot be default initialized at the declaration (prior to C++20), hence they are initialized at the constructor. The only problem is that `ShouldEmitErrorsOnInvalidConfigValue` was forgotten. In this patch I'm proposing to initialize that field with the rest. Note that this value is read by `CheckerRegistry.cpp:insertAndValidate()`. The analyzer options are initialized by the marshalling at `CompilerInvocation.cpp:GenerateAnalyzerArgs()` by the expansion of the `ANALYZER_OPTION_WITH_MARSHALLING` xmacro to the appropriate default value regardless of the constructor initialized list which I'm touching. Due to that this only affects users using CSA as a library, without serious effort, I believe we cannot test this. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D133851	2022-09-14 16:45:44 +02:00
Joseph Huber	194ec844f5	[OpenMP][AMDGPU] Link bitcode ROCm device libraries per-TU Previously, we linked in the ROCm device libraries which provide math and other utility functions late. This is not strictly correct as this library contains several flags that are only set per-TU, such as fast math or denormalization. This patch changes this to pass the bitcode libraries per-TU using the same method we use for the CUDA libraries. This has the advantage that we correctly propagate attributes making this implementation more correct. Additionally, many annoying unused functions were not being fully removed during LTO. This led to erroneous warning messages and remarks on unused functions. I am not sure if not finding these libraries should be a hard error. Let me know if it should be demoted to a warning saying that some device utilities will not work without them. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D133726	2022-09-14 09:42:06 -05:00
Joseph Huber	2d26ecb1fb	[OpenMP] Remove simplified device runtime handling The old device runtime had a "simplified" version that prevented many of the runtime features from being initialized. The old device runtime was deleted in LLVM 14 and is no longer in use. Selectively deactivating features is now done using specific flags rather than the old technique. This patch simply removes the extra logic required for handling the old simple runtime scheme. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D133802	2022-09-14 09:41:50 -05:00
Nikita Popov	b1cd393f9e	[AA] Tracking per-location ModRef info in FunctionModRefBehavior (NFCI) Currently, FunctionModRefBehavior tracks whether the function reads or writes memory (ModRefInfo) and which locations it can access (argmem, inaccessiblemem and other). This patch changes it to track ModRef information per-location instead. To give two examples of why this is useful: * D117095 highlights a weakness of ModRef modelling in the presence of operand bundles. For a memcpy call with deopt operand bundle, we want to say that it can read any memory, but only write argument memory. This would allow them to be treated like any other calls. However, we currently can't express this and have to say that it can read or write any memory. * D127383 would ideally be modelled as a separate threadid location, where threadid Refs outside pre-split coroutines can be ignored (like other accesses to constant memory). The current representation does not allow modelling this precisely. The patch as implemented is intended to be NFC, but there are some obvious opportunities for improvements and simplification. To fully capitalize on this we would also want to change the way we represent memory attributes on functions, but that's a larger change, and I think it makes sense to separate out the FunctionModRefBehavior refactoring. Differential Revision: https://reviews.llvm.org/D130896	2022-09-14 16:34:41 +02:00
Florian Hahn	efd3ec47d9	[ConstraintElimination] Clear new indices directly in getConstraint(NFC) Instead of checking if any of the new indices has a non-zero coefficient before using the constraint, do this directly when constructing the constraint.	2022-09-14 15:31:25 +01:00
Christian Sigg	5cff32b9f0	[MLIR] Fix toy lit substitutions The tools are called e.g. `toyc-ch1`, not `toy-ch1`. Add missing toyc-ch6/7. It turns out that the other substitutions are not needed more by specific circumstances rather than by design: The lit test exec root is set to build/mlir/test, which is where all the test tools are placed by CMake and we wouldn't need to substitute them at all. We shouldn't rely on this assumption though, because it will make things harder for standalone tests and other build systems. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D133842	2022-09-14 15:59:24 +02:00
Jordan Rupprecht	1f3def30ca	Fix heap-use-after-free when clearing DIEs in fission compile units. D131437 caused heap-use-after-free failures when testing TestCreateAfterAttach.py in asan mode, and "regular" crashes outside of asan. This appears to be due to a mismatch in a couple places where we choose to clear the DIEs. When we clear the DIE of a skeleton unit, we unconditionally clear the DIE of the DWO unit if it exists. However, `~ScopedExtractDIEs()` only looks at the skeleton unit when deciding to clear. If we decide to clear the skeleton unit because it is now unused, we end up clearing the DWO unit that _is_ used. This change adds a guard by checking `m_cancel_scopes` to prevent clearing the DWO unit. This is 100% reproducible by running TestCreateAfterAttach.py in asan mode, although it only seems to reproduce in our internal build, so no test case is added here. If someone has suggestions on how to write one, I can add it. Reviewed By: labath Differential Revision: https://reviews.llvm.org/D133790	2022-09-14 06:52:47 -07:00
Zain Jaffal	d1dec04d76	[AArch64] Disable nontemproal load for Big Endian The current code for generating nontemporal load outputs the wrong assembly for big endian architecture. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D133789	2022-09-14 14:49:55 +01:00
Sanjay Patel	73919a87e9	[InstCombine] try multi-use demanded bits folds for 'add' This patch enables a multi-use demanded bits fold (motivated by issue #57576): https://alive2.llvm.org/ce/z/DsZakh This mimics transforms that we already do on the single-use path. Originally, this patch did not include the last part to form a constant, but that can be removed independently to reduce risk. It's not clear what the effect of either change will be when viewed end-to-end. This is expected to be neutral or a slight win for compile-time. See the "add-demand2" series for experimental timing results: https://llvm-compile-time-tracker.com/?config=NewPM-O3&stat=instructions&remote=rotateright Differential Revision: https://reviews.llvm.org/D133788	2022-09-14 09:30:59 -04:00
Alexey Bataev	796af0c027	[SLP] Move getInsertIndex function, NFC. Part of D110978.	2022-09-14 06:22:52 -07:00
Mats Petersson	b36b27b3fc	[flang][driver]Fix broken flang-new mlir test The test was added as a .mlir file, and this extension is not in the lit.cfg.py, so it was never run. When running it, the file would produce an error, as semicolon is not an MLIR comment. This adds the extension and fixes the comment start by using C++ style comments. Reviewed By: awarzynski Differential Revision: https://reviews.llvm.org/D133792	2022-09-14 14:16:31 +01:00
Zain Jaffal	244a6a76d9	[AArch64] Add nontemporal load tests for big endian. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D133765	2022-09-14 13:51:58 +01:00
Nikita Popov	1cfbbba15b	[AA] Remove unnecessary intersections from getModRefBehavior() (NFC) Intersection with other providers is performed by AAResults. Doing this here is both pointless and confusing.	2022-09-14 14:26:39 +02:00
Florian Hahn	f213128b29	[ConstraintElimination] Further de-compose operands of add operations. This simply extends the existing logic to look through adds and combine the components as done in other places already.	2022-09-14 12:00:32 +01:00
Simon Pilgrim	854a4595b6	[CostModel][X86] getArithmeticInstrCost - move GLM/SLM custom costs AFTER constant shift -> multiply canonicalization Corrects the shift by constant costs to better account for them being converted to multiples for lowering - which demonstrates that we should probably be trying harder NOT to convert these to multiplies for some CPUs (v4i32 in particular).	2022-09-14 11:46:26 +01:00
Simon Pilgrim	40ab7875f8	[CostModel][X86] Fix throughput costs for AVX512BW v32i16 shifts Fixes regression from `a931dbfbd3`	2022-09-14 11:18:23 +01:00
Pavel Labath	d079bf33de	[lldb] Enable (un-xfail) some dwarf tests for arm These are passing now that the relocation assertion has been removed in D132954. Relocations still remain unimplemented though, so it's possible this may start to fail due to unrelated changes. If that happens very often, we may just need to disable (skip) the test instead.	2022-09-14 11:35:16 +02:00
Florian Hahn	aba2085e52	[ConstraintElimination] Add tests where info from zext can be used.	2022-09-14 10:04:07 +01:00
Pavel Kosov	a0fb69d17b	[lldb][DWARF5] Enable macro evaluation Patch enables handling of DWARFv5 DW_MACRO_define_strx and DW_MACRO_undef_strx ~~~ OS Laboratory. Huawei RRI. Saint-Petersburg Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D130062	2022-09-14 11:32:07 +03:00
Marco Elver	4627a30acf	[MIR] Support printing and parsing pcsections Adds support for printing and parsing PC sections metadata in MIR. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D133785	2022-09-14 10:30:25 +02:00
Florian Hahn	62c1928437	[ConstraintElimination] Add tests for chained adds. Add test coverage for reasoning about chains of adds.	2022-09-14 09:27:18 +01:00
Azat Khuzhin	44075cc34a	[test][clang] run test for lld emitting dwarf-aranages only if lld is presented Fixes: https://reviews.llvm.org/D133092 CI: https://lab.llvm.org/buildbot/#/builders/109/builds/46592 Reviewed By: hokein Differential Revision: https://reviews.llvm.org/D133841	2022-09-14 10:17:03 +02:00

436174 Commits
